We point out that a proper use of the Hoeffding-ANOVA decomposition for symmetric statistics
of finite urn sequences, previously introduced by the author, yields a decomposition of the space
of square-integrable functionals of a Dirichlet-Ferguson process, written $L^2(D)$, into orthogonal
subspaces of multiple integrals of increasing order. This gives an isomorphism between $L^2(D)$
and an appropriate Fock space over a class of deterministic functions. By means of a well-known
result due to Blackwell and MacQueen, we show that each element of the $n$th orthogonal space
of multiple integrals can be represented as the limit of U-statistics with degenerate kernel of
degree $n$. General formulae for the decomposition of a given functional are provided in terms of
linear combinations of conditioned expectations whose coefficients are explicitly computed. We
show that, in simple cases, multiple integrals have a natural representation in terms of Jacobi
polynomials. Several connections are established, in particular with Bayesian decision problems,
and with some classic formulae concerning the transition densities of multiallele diffusion models,
due to Littler and Fackerell, and Griffiths. Our results may also be used to calculate the best
approximation of elements of $L^2(D)$ by means of U-statistics of finite vectors of exchangeable
observations.

Keywords: Bayesian statistics; Dirichlet process; exchangeability; Hoeffding-ANOVA
decompositions; Jacobi polynomials; multiple integrals; orthogonality; U-statistics; urn

1. Introduction and preliminaries

Let $(A, \mathcal{A})$ be a Polish space endowed with its Borel $\sigma$-field and consider a finite positive
measure $\alpha$ on $(A, \mathcal{A})$. According to [8], given a probability space $(\Omega, \mathcal{F}, P)$, we say that
a random probability measure $\{D(C;\omega) : C \in \mathcal{A}\}$, where $\omega \in \Omega$, is a Dirichlet-Ferguson
process (in the sequel, DF process) with parameter $\alpha$ if, for every finite measurable
partition $(C_1, \dots, C_n)$ of $A$, the vector $(D(C_1;\cdot), \dots, D(C_n;\cdot))$ has a Dirichlet distribution
with parameters $(\alpha(C_1), \dots, \alpha(C_n))$, with the convention that $\alpha(C_i) = 0$ means $D(C_i) = 0$,
$P$-a.s. (throughout the sequel, whenever there is no risk of confusion, we will write
$D(C;\omega)$, $D(C;\cdot)$ or $D(C)$ depending on notational convenience). Note that, when $\alpha$ is
non-atomic, in the terminology of [24], $D$ is a normalized gamma process on $(A, \mathcal{A})$. DF
processes were first introduced and analyzed in the fundamental papers [3, 4, 8] and have
since played a central role in Bayesian nonparametric statistics (we refer the reader to the
above-quoted references, as well as [9, 13] and [20], for basic discussions in this direction;
see also [19] for a survey of a large class of random measures related to DF processes).
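
As a small numerical companion to the definition above, the following Python sketch computes the exact mean and variance of $D(C_i)$ for the cells of a finite partition. It relies only on the standard fact that each coordinate of a Dirichlet vector with parameters $(\alpha(C_1), \dots, \alpha(C_n))$ is Beta distributed with parameters $(\alpha(C_i), \alpha(A) - \alpha(C_i))$. The function name `dirichlet_cell_moments` and the numerical masses are illustrative assumptions, not objects taken from the paper.

```python
from fractions import Fraction

def dirichlet_cell_moments(alpha_cells):
    """Exact (mean, variance) of D(C_i) for each cell C_i of a finite partition.

    alpha_cells lists the masses alpha(C_1), ..., alpha(C_n); their sum is alpha(A).
    Each D(C_i) is Beta(alpha(C_i), alpha(A) - alpha(C_i)) distributed.
    """
    total = sum(Fraction(a) for a in alpha_cells)
    moments = []
    for a in alpha_cells:
        mean = Fraction(a) / total
        variance = mean * (1 - mean) / (total + 1)
        moments.append((mean, variance))
    return moments

# Hypothetical partition with masses 1, 2 and 3, so that alpha(A) = 6.
moments = dirichlet_cell_moments([1, 2, 3])
```

The variance shrinks as the total mass $\alpha(A)$ grows, which is one way to read $\alpha(A)$ as the weight of the prior information.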

Now, let $L^2(D) = L^2(D, P)$ denote the Hilbert space of square-integrable functionals of
the random measure $D$. The aim of this paper is to obtain an orthogonal decomposition
of $L^2(D)$ based on the theory of orthogonal and symmetric U-statistics developed in [18]
(see also [5]). Such a result is the analogue, for the random measure $D$, of the "chaotic"
decompositions of square-integrable functionals of Gaussian processes (see, e.g., [23] and
[14] and the references therein) or Levy processes (see [22] and [15]). In particular, we
will show that every element of $L^2(D)$ admits a unique representation as an infinite
orthogonal sum of multiple integrals of increasing order with respect to $D$ and therefore
that $L^2(D)$ is isomorphic to an appropriate Fock space over a class of deterministic
functions. Our results contain as special cases several classic computations contained in
[8] and [9], mainly related to Bayesian decision problems. Moreover, they provide an
exhaustive characterization of the covariance structure of the elements of $L^2(D)$, for any
choice of $(A, \mathcal{A})$ and $\alpha$. In this sense, our results are the infinite-dimensional analogues of
the orthogonal polynomial decompositions of functionals of finite Dirichlet vectors, used,
for example, by Littler and Fackerell (see [12]) and Griffiths (see [10]) to make explicit
the transition density associated with a (finite) multi-allele diffusion model, having the
Dirichlet law as stationary measure. Some applications are outlined in Section 1.3 as well
as in Sections 6 and 7 below.

To partially illustrate our methods and results in a specific framework, we will first 
present the example of a simple DF process on {0, 1}. 



1.1. Preliminary example: Beta random variables and Jacobi 
polynomials 

Fix real numbers $\alpha_1, \alpha_0 > 0$ and consider a Beta random variable $\eta(\omega)$ with values in
$[0, 1]$ and parameters $(\alpha_1, \alpha_0)$. This means that, for every Borel set $C$,

\[
P(\eta \in C) = \frac{1}{B(\alpha_1,\alpha_0)} \int_{C \cap [0,1]} x^{\alpha_1 - 1}(1-x)^{\alpha_0 - 1}\,dx, \tag{1}
\]

where $B(\cdot,\cdot)$ is the Beta function, defined as $B(s,t) = \int_0^1 x^{s-1}(1-x)^{t-1}\,dx$ (see, e.g., [1]).
We may interpret $\eta$ as a random parameter, determining a random probability measure
$D(\cdot,\omega)$ on $\{0,1\}$, via the relations

\[
D(\{1\},\omega) = \eta(\omega) = 1 - D(\{0\},\omega). \tag{2}
\]

The measure $D(\cdot,\omega)$, as defined in (2), is the most elementary example of a DF process
and corresponds, in particular, to the case $A = \{0, 1\}$ and $\alpha(\cdot) = \alpha_1\delta_1(\cdot) + \alpha_0\delta_0(\cdot)$, where
$\delta_x$ stands for the Dirac measure concentrated at $x$. To simplify, we adopt the notation
$p_{\alpha_1,\alpha_0}(x) = B(\alpha_1,\alpha_0)^{-1}x^{\alpha_1-1}(1-x)^{\alpha_0-1}$ and define $L^2(\eta)$ and $\mathcal{P}_n(\eta)$, $n \ge 0$, to be,
respectively, the space of square-integrable functionals of $\eta$ and the subspace of $L^2(\eta)$
composed of random variables of the form $\pi_n(\eta)$, where $\pi_n(\cdot)$ is a polynomial of order
$n$. Note that $L^2(\eta) = L^2(D)$, $\mathcal{P}_0(\eta) = \mathbb{R}$, $\mathcal{P}_n(\eta) \subset \mathcal{P}_{n+1}(\eta)$ and the union of the $\mathcal{P}_n(\eta)$'s
is total in $L^2(\eta)$; we also set $\mathcal{J}_0(\eta) := \mathbb{R}$ and $\mathcal{J}_n(\eta) := \mathcal{P}_n(\eta) \cap \mathcal{P}_{n-1}(\eta)^{\perp}$, $n \ge 1$, where
"$\perp$" stands for the orthogonality relation in $L^2(\eta)$. It is well known that the orthogonal
sequence of subspaces $\{\mathcal{J}_n(\eta) : n \ge 0\}$ can be exhaustively characterized in terms of Jacobi
polynomials (again, see [1], Section 22), defined, for $n \ge 0$, $q > 0$ and $p > q - 1$, as
$G_n(p,q,x) := \sum_{a=0,\dots,n} c_{n,a}(p,q)\,x^a$, where
$c_{n,a}(p,q) := \binom{n}{a}(-1)^{n-a}\frac{\Gamma(q+n)\Gamma(p+n+a)}{\Gamma(p+2n)\Gamma(q+a)}$.
Indeed, for $\alpha_1$ and $\alpha_0$ as before, one can prove that the sequence of modified Jacobi
polynomials, defined through the relation

\[
J_n^{\alpha_1,\alpha_0}(x) := K(n,\alpha_1,\alpha_0)^{1/2}\,G_n(\alpha_1+\alpha_0-1,\alpha_1,x)
= K(n,\alpha_1,\alpha_0)^{1/2}\sum_{a=0}^{n} c_{n,a}(\alpha_1+\alpha_0-1,\alpha_1)\,x^a, \qquad n \ge 0, \tag{3}
\]

where

\[
K(n,\alpha_1,\alpha_0) := B(\alpha_1,\alpha_0)\,\frac{(2n+\alpha_1+\alpha_0-1)\,\Gamma(2n+\alpha_0+\alpha_1-1)^2}{n!\,\Gamma(n+\alpha_1)\,\Gamma(n+\alpha_0)\,\Gamma(n+\alpha_1+\alpha_0-1)},
\]

is such that $\int_0^1 J_m^{\alpha_1,\alpha_0}(x)J_n^{\alpha_1,\alpha_0}(x)\,p_{\alpha_1,\alpha_0}(x)\,dx = 0$ or $1$, according to whether $m \ne n$ or
$m = n$, thus implying that the class $\{J_n^{\alpha_1,\alpha_0} : n \ge 0\}$ is a family of orthogonal polynomials
associated with the weight function $p_{\alpha_1,\alpha_0}$ on the interval $[0, 1]$. This immediately yields
that, for every $n \ge 0$, $X \in \mathcal{J}_n(\eta)$ if and only if $X = cJ_n^{\alpha_1,\alpha_0}(\eta)$ for some real constant $c$
and therefore that every $F \in L^2(\eta)$ admits a unique representation of the form

\[
F = E(F) + \sum_{n=1}^{\infty} c_n J_n^{\alpha_1,\alpha_0}(\eta), \tag{4}
\]

where the real constants $c_n$ are such that $\sum_n c_n^2 < +\infty$ (i.e., the series on the right-hand
side of (4) converges in $L^2(\eta)$). It is not difficult to see (see Section 5 below for a
complete discussion of this point) that, for every $n$, the random variable $c_nJ_n^{\alpha_1,\alpha_0}(\eta)$ can
be (uniquely) written in the form $\int_{\{0,1\}^n} \phi_n\,dD^{\otimes n}$, where $dD^{\otimes n}$ is the random product
measure on $\{0, 1\}^n$ generated by the random probability defined in (2) and $\phi_n$ is a
well-chosen symmetric kernel on $\{0,1\}^n$. This implies, in particular, that every $F \in L^2(\eta)$
admits a decomposition as an infinite orthogonal sum of multiple random integrals,
that is,

\[
F = E(F) + \sum_{n=1}^{\infty} \int_{\{0,1\}^n} \phi_n\,dD^{\otimes n}. \tag{5}
\]
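
The orthogonality behind (4) can be checked numerically without using the closed form (3). The Python sketch below is an illustration under stated assumptions, not the construction used in the paper: it builds polynomials that are orthonormal in $L^2(\eta)$ by Gram-Schmidt on the monomials, using the exact Beta moments $E[\eta^k] = B(\alpha_1 + k, \alpha_0)/B(\alpha_1, \alpha_0)$. By uniqueness of orthonormal polynomials up to sign, the outputs should agree with the modified Jacobi polynomials up to a factor $\pm 1$. All function names and the parameter values are hypothetical.

```python
from math import exp, lgamma

def beta_moment(k, a1, a0):
    """E[eta^k] for eta ~ Beta(a1, a0), computed as B(a1 + k, a0) / B(a1, a0)."""
    def log_beta(s, t):
        return lgamma(s) + lgamma(t) - lgamma(s + t)
    return exp(log_beta(a1 + k, a0) - log_beta(a1, a0))

def inner(p, q, a1, a0):
    """Inner product E[p(eta) q(eta)] of polynomials given as coefficient lists."""
    return sum(pi * qj * beta_moment(i + j, a1, a0)
               for i, pi in enumerate(p) for j, qj in enumerate(q))

def orthonormal_polys(n_max, a1, a0):
    """Gram-Schmidt on 1, x, ..., x^n_max (coefficients stored lowest degree first)."""
    basis = []
    for n in range(n_max + 1):
        p = [0.0] * n + [1.0]                      # the monomial x^n
        for e in basis:                            # remove components along lower degrees
            coeff = inner(p, e, a1, a0)
            p = [pi - coeff * (e[i] if i < len(e) else 0.0) for i, pi in enumerate(p)]
        norm = inner(p, p, a1, a0) ** 0.5
        basis.append([pi / norm for pi in p])
    return basis

# Hypothetical parameters alpha_1 = 2.0 and alpha_0 = 1.5.
polys = orthonormal_polys(3, a1=2.0, a0=1.5)
gram = [[inner(p, q, 2.0, 1.5) for q in polys] for p in polys]
```

The matrix `gram` should be close to the identity, which is the numerical counterpart of the statement that the integral of $J_m^{\alpha_1,\alpha_0}J_n^{\alpha_1,\alpha_0}$ against $p_{\alpha_1,\alpha_0}$ is 0 or 1.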
We will complete the example above in Section 5 by showing that the kernels $\phi_n$ have
a natural interpretation in terms of U-statistics. The connections between our results
and other special polynomials in several variables are discussed in Section 6. Before that,
we shall generalize the representations (4) and (5) by obtaining an analogous orthogonal
decomposition of the space $L^2(D)$ associated with a DF process $D$, with an arbitrary
parameter $\alpha(\cdot)$ and defined on a general Polish space $(A, \mathcal{A})$.

1.2. Discussion of the main results 

Let $\alpha$ be a finite measure on the Polish space $(A, \mathcal{A})$ and let $D$ be a DF process of
parameter $\alpha$. To obtain our main results - and to be able to use the theory developed
in [18] - we shall suppose that the law of $D$ is the de Finetti measure, that is, that
$D$ is the directing measure of an infinite exchangeable sequence $\mathbf{X} = \{X_n : n \ge 1\}$ of
random variables with values in $(A, \mathcal{A})$. This means that the sequence $\mathbf{X}$ is defined on
the same probability space as $D$ and that, conditioned on $D$, $\mathbf{X}$ is composed of i.i.d.
random variables with common law equal to $D$ (see [2] for an exhaustive discussion of
this point). Note that, given a general random probability measure $M(\cdot;\omega)$, there always
exists (on a possibly enlarged probability space) an exchangeable sequence $\mathbf{Y}$ such that
$M$ is the directing measure of $\mathbf{Y}$. Then, according to, for example, [4], $\mathbf{X}$ must be an
infinite generalized Polya urn sequence with parameter $\alpha$, as defined in the above-quoted
reference and in Section 2 below. Note that, in this case, $D$ is automatically the a.s. limit
of the sequence of empirical measures generated by $\mathbf{X}$.

The principal achievement of the present paper is to prove (Theorem 1) that every
$F \in L^2(D)$ has a unique representation of the type

\[
F = E(F) + \sum_{n \ge 1} \int_{A^n} h_{(F,n)}(a_1,\dots,a_n)\,D^{\otimes n}(da_1,\dots,da_n)
  = E(F) + \sum_{n \ge 1} \int_{A^n} h_{(F,n)}\,dD^{\otimes n}, \tag{6}
\]

where $D^{\otimes n}$ indicates the $n$-dimensional (random) product measure associated with $D$,
the series converges in $L^2(D)$ and the kernels $h_{(F,n)}$, $n \ge 1$, are deterministic, symmetric and
such that, for every $n$,

\[
E(h_{(F,n)}(\mathbf{X}_n)^2) < +\infty \quad\text{and}\quad E(h_{(F,n)}(\mathbf{X}_n) \mid \mathbf{X}_{n-1}) = 0, \quad P\text{-a.s.} \tag{7}
\]

Here, $\mathbf{X}_n = (X_1, \dots, X_n)$ represents, for every $n \ge 1$, the first $n$ instants of the Polya
sequence $\mathbf{X}$ introduced at the beginning of this subsection. Consistent with the notation
of [18] and [16], Chapters 9 and 10, and for $n \ge 1$, the class of symmetric functions $h$ on
$A^n$ satisfying condition (7) is denoted $\Xi_n(\mathbf{X})$. The functions $h_{(F,n)} \in \Xi_n(\mathbf{X})$ appearing
in (6) may be interpreted, for every $n$, as completely degenerate kernels of symmetric
U-statistics (see, e.g., [11]) based on a truncation of the sequence $\mathbf{X}$. As a consequence
(see Proposition 3 below), an application of the results contained in [18] yields that the
sequence $\int_{A^n} h_{(F,n)}\,dD^{\otimes n}$, $n \ge 1$, appearing in (6) enjoys the following isometric property:
for every $n, m \ge 1$,

\[
E\Big(\int_{A^n} h_{(F,n)}\,dD^{\otimes n} \int_{A^m} h_{(F,m)}\,dD^{\otimes m}\Big)
 = \varepsilon_{m,n} \times c(n,\alpha(A))\,E(h_{(F,n)}(\mathbf{X}_n)^2), \tag{8}
\]

where $c(n,\alpha(A)) := \prod_{l=1}^{n} \frac{n-l+1}{\alpha(A)+n+l-1}$ and $\varepsilon_{m,n}$ equals 0 or 1 according
to whether $m \ne n$ or $m = n$. As anticipated (see Proposition 5 below), a random variable
of the type $\int_{A^n} h\,dD^{\otimes n}$, $h \in \Xi_n(\mathbf{X})$, represents the infinite-dimensional analogue of
the modified Jacobi polynomial introduced in (3). Note that formula (8) determines an
isomorphism between $L^2(D)$ and the orthogonal sum

\[
\bigoplus_{n \ge 0} \sqrt{c(n,\alpha(A))}\,\Xi_n(\mathbf{X}) \simeq \bigoplus_{n \ge 0} \sqrt{c(n,\alpha(A))}\,SH_n(\mathbf{X}_n), \tag{9}
\]

where "$\simeq$" indicates a Hilbert space isomorphism, $SH_0 = \mathbb{R}$ and $SH_n(\mathbf{X}_n)$ is the $n$th
symmetric Hoeffding space associated with the finite Polya urn sequence $\mathbf{X}_n$ (see [18],
Section 3, and [17], as well as Section 3 below). More to the point, a recursive formula is
given (Theorem 2) to explicitly calculate real coefficients $\{\theta^{(n,k)} : n \ge 1,\ 1 \le k \le n\}$ that
depend uniquely on $\alpha(A)$ and satisfy the relation

\[
h_{(F,n)}(a_1,\dots,a_n) = \sum_{k=1}^{n} \theta^{(n,k)} \sum_{1 \le i_1 < \cdots < i_k \le n}
E[F - E(F) \mid X_1 = a_{i_1}, \dots, X_k = a_{i_k}] \tag{10}
\]

for every $F \in L^2(D)$. It is worth noting that (10) is quite explicit since, according to,
for example, [8], Theorem 1, for every $k \ge 1$ and every $(a_1,\dots,a_k) \in A^k$, the law of
$D$ under the conditioned measure $P(\cdot \mid X_1 = a_1, \dots, X_k = a_k)$ is that of a DF process
with parameter $\alpha + \sum_{i=1}^{k} \delta_{a_i}$, where $\delta_a$ indicates the Dirac mass at $a$. Such a stability
property of the class of DF processes is usually summarized by saying that DF processes
are conjugate (see [20]). Also, observe that, unlike other chaotic decompositions, to obtain
the explicit formula (10), we do not need any regularity assumption on $F$ (see, e.g., [23]
for Wiener chaos, where the regularity assumptions are related to weak differentiability,
in the sense of Shigekawa-Malliavin).

1.3. Some motivations from Bayesian nonparametric statistics 
(Bayesian estimation of conditional variances) 

Our results contain, as special cases, several computations from [8], Sections 4 and 5,
and therefore have some immediate applications to (nonparametric) Bayesian statistical
decision problems. As an illustration, consider the following setup (see [8] for further
details). We are given a sequence of random variables $\mathbf{X} = \{X_i : i \ge 0\}$, modeling the
observations of a random phenomenon with values in $(A, \mathcal{A})$, such that $X_0 = x_0 \in A$ and,
conditionally on the realization of a random probability measure $D$, the sequence $\{X_i : i \ge 1\}$
is i.i.d. with common law equal to $D$. We shall use the notation $(X_0, X_1, \dots, X_n) = \mathbf{X}_n$,
$n \ge 0$, and suppose that $D$ has the law (known as the prior distribution) of a DF process
with parameter $\alpha$, with $\alpha(A) < +\infty$. In particular, the measure $\alpha$ (which completely
determines the law of $D$) is a mathematical representation of the initial information of
the observer. We also consider a functional $F \in L^2(D)$ with the form of a conditional
variance, that is,

\[
F = V(h) = \int_{A^\infty} h(\mathbf{a})^2\,D^{\infty}(d\mathbf{a}) - \Big(\int_{A^\infty} h(\mathbf{a})\,D^{\infty}(d\mathbf{a})\Big)^2
  = E[(h(\mathbf{X}) - E[h(\mathbf{X}) \mid D])^2 \mid D],
\]

where $h : A^\infty \to \mathbb{R}$ is such that $F \in L^2(D)$ and $D^\infty$ is the canonical (infinite) product
measure induced by $D$ on $A^\infty$. Note that, except in trivial cases, $F$ is a function of the
whole (infinite) sequence $\mathbf{X}$: it follows that its value cannot be inferred from any finite
set of observations. Given the sample $\mathbf{x}_n = (x_0, x_1, \dots, x_n)$ of the first $n + 1$ observations
($n \ge 0$), we therefore face the following decision problem: provide an estimation of $F$
by choosing the square-integrable statistic $h_{V(h)}(\mathbf{x}_n)$ that minimizes the (conditional)
expected square loss

\[
L(h_{V(h)};\alpha;\mathbf{x}_n) = E[(F - h_{V(h)}(\mathbf{X}_n))^2 \mid (X_0,\dots,X_n) = \mathbf{x}_n]; \tag{11}
\]

observe that, with this notation, for $n = 0$,

\[
L(h_{V(h)};\alpha;\mathbf{x}_0) = L(h_{V(h)};\alpha;\mathbf{x}_n) = E[(V(h) - h_{V(h)}(x_0))^2].
\]

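Since the conjugacy of DF processes implies that, under the probability
$P[\,\cdot \mid (X_0,\dots,X_n) = \mathbf{x}_n]$, $D$ has the law of a DF process with parameter
$\alpha_{\mathbf{x}_n} := \alpha + \sum_{j=1}^{n} \delta_{x_j}$, for any choice of $\alpha$ and $\mathbf{x}_n$, we have
$L(h_{V(h)};\alpha;\mathbf{x}_n) = L(h_{V(h)};\alpha_{\mathbf{x}_n};x_0)$, with $\delta_x$ the Dirac
mass in $x$. Elementary computations therefore yield

\[
\begin{aligned}
h_{V(h)}(\mathbf{x}_n) &= E\Big[\int_{A^\infty} h(\mathbf{a})^2\,D^{\infty}(d\mathbf{a}) - \Big(\int_{A^\infty} h(\mathbf{a})\,D^{\infty}(d\mathbf{a})\Big)^2\Big] \\
&= E[h(\mathbf{X})^2 \mid (X_0,\dots,X_n) = \mathbf{x}_n] - E[E[h(\mathbf{X}) \mid D]^2 \mid (X_0,\dots,X_n) = \mathbf{x}_n],
\end{aligned} \tag{12}
\]

where, in the first line, $D$ is a DF process with parameter $\alpha_{\mathbf{x}_n}$. Now, consider the following decomposition
of $E[h(\mathbf{X}) \mid D] = \int_{A^\infty} h\,dD^\infty$ under the measure $P[\,\cdot \mid (X_0,\dots,X_n) = \mathbf{x}_n]$:

\[
E[h(\mathbf{X}) \mid D] = E[h(\mathbf{X}) \mid (X_0,\dots,X_n) = \mathbf{x}_n]
 + \sum_{k=1}^{\infty} \int_{A^k} h^{(\mathbf{x}_n)}_{(E[h(\mathbf{X}) \mid D],\,k)}(a_1,\dots,a_k)\,dD^{\otimes k}(a_1,\dots,a_k),
\]

where the kernels $h^{(\mathbf{x}_n)}_{(E[h(\mathbf{X}) \mid D],\,k)}$ can be obtained by applying formula (10) to a DF process
with parameter $\alpha_{\mathbf{x}_n}$. We can conclude from (12) and the orthogonality of the integrals of the
$h^{(\mathbf{x}_n)}_{(E[h(\mathbf{X}) \mid D],\,k)}$ that

\[
\begin{aligned}
h_{V(h)}(\mathbf{x}_n) = {}& \mathrm{Var}[h(\mathbf{X}) \mid (X_0,\dots,X_n) = \mathbf{x}_n] \\
& - \sum_{k=1}^{\infty} c(k,\alpha(A)+n) \times E[h^{(\mathbf{x}_n)}_{(E[h(\mathbf{X}) \mid D],\,k)}(X_{n+1},\dots,X_{n+k})^2 \mid (X_0,\dots,X_n) = \mathbf{x}_n],
\end{aligned} \tag{13}
\]

where $\mathrm{Var}[\,\cdot \mid (X_0,\dots,X_n) = \mathbf{x}_n]$ stands for the conditional variance. We stress that the
kernels $h^{(\mathbf{x}_n)}_{(E[h(\mathbf{X}) \mid D],\,k)}$ in (13) are explicitly known, due to formula (10). Moreover, it will
become clear from the subsequent analysis that if $V(h) = E[(h(\mathbf{X}_m) - E[h(\mathbf{X}_m) \mid D])^2 \mid D]$
for some $1 \le m < +\infty$, then $h^{(\mathbf{x}_n)}_{(E[h(\mathbf{X}) \mid D],\,k)} = 0$ for $k > m$ and therefore the right-hand side
of (13) is just a finite sum. In particular, formula (13) generalizes the computations
contained in [8], Section 5(e). For instance, if $h(\mathbf{a}) = h(a_1)$ (so that $V(h) = \int_A h^2\,dD - (\int_A h\,dD)^2$),
then (13) reduces to the well-known formula (see, e.g., [8], page 226)

\[
h_{V(h)}(\mathbf{x}_n) = \frac{\alpha(A)+n}{\alpha(A)+n+1}\,\mathrm{Var}[h(X_{n+1}) \mid (X_0,\dots,X_n) = \mathbf{x}_n].
\]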
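
The last formula is easy to evaluate on a finite space. The Python sketch below is a hedged illustration only: it assumes that $A$ is finite, that $h$ depends on one coordinate, and that the observed points carry unit Dirac masses as in the conjugacy property. The estimate is computed twice, once by the closed form above and once directly from the exact second moments of a Dirichlet vector, $E[D_aD_c] = \beta_a(\beta_c + 1_{\{a=c\}})/(m(m+1))$ with $m = \sum_a \beta_a$. The names `bayes_variance_estimate` and `expected_variance_by_moments`, and all numerical values, are hypothetical.

```python
from fractions import Fraction

def bayes_variance_estimate(alpha, h, observations):
    """Closed form: (mass / (mass + 1)) * predictive variance of h.

    alpha: dict point -> prior mass; h: dict point -> value;
    observations: list of observed points, each adding a unit Dirac mass.
    """
    post = dict(alpha)
    for x in observations:                       # conjugacy: alpha + sum of Dirac masses
        post[x] = post.get(x, 0) + 1
    mass = sum(Fraction(w) for w in post.values())
    mean = sum(Fraction(w) * h[a] for a, w in post.items()) / mass
    pred_var = sum(Fraction(w) * (h[a] - mean) ** 2 for a, w in post.items()) / mass
    return mass / (mass + 1) * pred_var

def expected_variance_by_moments(beta, h):
    """E[sum_a h(a)^2 D_a - (sum_a h(a) D_a)^2] for D ~ Dirichlet(beta)."""
    m = sum(Fraction(b) for b in beta.values())
    first = sum(Fraction(b) * h[a] ** 2 for a, b in beta.items()) / m
    second = Fraction(0)
    for a, ba in beta.items():
        for c, bc in beta.items():
            # E[D_a D_c] from the exact second moments of a Dirichlet vector
            second += h[a] * h[c] * Fraction(ba) * (bc + (1 if a == c else 0)) / (m * (m + 1))
    return first - second

# Hypothetical prior on A = {0, 1, 2} and three observations.
alpha = {0: 1, 1: 2, 2: 1}
h = {0: 0, 1: 1, 2: 4}
estimate = bayes_variance_estimate(alpha, h, [1, 2, 2])
```

Both routes give the same rational number, which makes the shrinking factor $(\alpha(A)+n)/(\alpha(A)+n+1)$ visible as the price of not knowing $D$.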

1.4. Further remarks and organization of the paper 

As discussed below, Theorems 1 and 2 represent a logical continuation of the results
contained in [18], Section 5, where we obtained the explicit Hoeffding-ANOVA
decomposition for symmetric statistics of vectors of exchangeable observations that are (finite)
Generalized Urn Sequences (GUS). The class of GUS contains, as special cases, vectors
of i.i.d. random variables, as well as extractions without replacement from a finite
population and truncated Polya urn sequences. The results of this paper are mainly obtained
by properly extending to the infinite-dimensional case the content of the above-quoted
reference.

The analysis contained in this work is also related to another statistical problem.
Suppose that we are given a vector $(X_1, \dots, X_n)$ of exchangeable observations that are
the first $n$ instants of the infinite Polya sequence $\mathbf{X}$. What is the best approximation
of a generic element of $L^2(D)$ by means of U-statistics that are based exclusively on
$(X_1, \dots, X_n)$, and how can one compute the corresponding quadratic error? Of course,
one can ask the same question for a general infinite exchangeable sequence and for any
square-integrable functional of its directing measure. In the last section of the paper, we
shall show that a general solution to this problem is contained in formulae (8) and (10)
above, as well as in the calculations performed in [18].

The paper is organized as follows. In the next section, we recall some results about
Dirichlet processes and exchangeable sequences which are due to Blackwell and
MacQueen. In Section 3, some preliminaries about urn sequences and Hoeffding-ANOVA
decompositions are presented. Section 4 contains the statements and proofs of the two
main theorems of this work. In Section 5, we complete the study of the simple Dirichlet
process introduced in (2) and establish, in this case, an explicit relation between Jacobi
polynomials and the multiple random integrals appearing in (6). In Section 6, several
connections are discussed between our decomposition of the space $L^2(D)$ and the family
of generalized Appell-Jacobi polynomials used in [12] and [10] to make explicit the
transition density of a Wright-Fisher diffusion process. Section 7 contains further applications
and examples.

The results of this paper have been partially announced in [17].

2. Blackwell-MacQueen construction of the Dirichlet
process

The main idea of [4] is that a general Dirichlet process can be represented as the limit of
the empirical measures associated with an infinite, exchangeable sequence of observations
and that such a sequence can be taken to be a generalization of the so-called Polya urn
scheme. The notation of the previous section is maintained throughout the sequel.

Suppose that a sequence of random variables $\mathbf{X} = \{X_n : n \ge 1\}$ is defined on the
probability space $(\Omega, \mathcal{F}, P)$, taking values in $(A, \mathcal{A})$ and such that, for every $k \ge 1$ and every
$1 \le j_1 < \cdots < j_k < +\infty$,

\[
P(X_{j_1} \in da_1, \dots, X_{j_k} \in da_k) = \prod_{l=1}^{k} \frac{\alpha(da_l) + \sum_{i=1}^{l-1} \delta_{a_i}(da_l)}{\alpha(A) + l - 1}. \tag{14}
\]

We can think of $\mathbf{X}$ as an infinite sequence of extractions from an urn $A$, whose initial
composition is given by the measure $\alpha$, according to the classic Polya scheme: at each
step, a ball is extracted and two balls of the same color are placed in the urn before
the next extraction. It is also clear, due to (14), that $\mathbf{X}$ is an infinite exchangeable
sequence, in the sense that its law is invariant under finite permutations of the index set
$\{1, 2, \dots\}$. We call $\mathbf{X}$ an (infinite) Polya sequence with parameter $\alpha$. In the terminology
of [18], Section 5, we have that, for each $N \ge 2$, the vector $\mathbf{X}_N = (X_1, \dots, X_N)$ is a
Generalized Urn Sequence (GUS) of length $N$ with parameters $\alpha$ and $c = 1$. This implies
that, for every $k \ge 1$ and every $(a_1, \dots, a_k) \in A^k$, under the conditioned probability
$P(\cdot \mid X_1 = a_1, \dots, X_k = a_k)$, the sequence $\{X_{k+n} : n \ge 1\}$ is an infinite Polya sequence with
parameter $\alpha + \sum_{i=1}^{k} \delta_{a_i}$. The next result, which is proved in [4], states that the sequence
of empirical measures generated by $\mathbf{X}$ converges almost surely to a DF process with
parameter $\alpha$ and that the law of such a process coincides with the de Finetti measure
associated with $\mathbf{X}$.

Theorem (Blackwell and MacQueen [4]). Using the previous notation, for every $n$,
define

\[
P_n(C;\omega) = \frac{1}{n}\sum_{i=1}^{n} \mathbf{1}_C(X_i(\omega)), \qquad C \in \mathcal{A},
\]

to be the empirical measure associated with the vector $\mathbf{X}_n$. Then,
(a) as $n$ goes to infinity, the random measure $P_n(\cdot;\omega)$ converges $P$-a.s. to a random
discrete probability $D(\cdot;\omega)$ on $(A, \mathcal{A})$;

(b) the measure $D$ appearing in (a) is a DF process with parameter $\alpha$;

(c) given $D$, the variables $X_1, X_2, \dots$ composing the sequence $\mathbf{X}$ are independent and
identically distributed with law $D$.

By inspection of the proof contained in [4], the convergence in item (a) of the previous
theorem can be interpreted in the following sense: there exists $\Omega_* \in \mathcal{F}$ such that $P(\Omega_*) = 1$
and, for every $\omega \in \Omega_*$,

\[
P_n(C;\omega) \to D(C;\omega) \qquad \forall C \in \mathcal{A}.
\]

This implies that, almost surely, $P_n$ weakly converges to $D$. The reader is also referred
to [19], where it is shown that weak convergence may be replaced by convergence in total
variation. Also, note that dominated convergence implies that, for any measurable set
$C$, $E[(P_n(C) - D(C))^2] \to 0$. In the classical terminology of [8], item (c) of the previous
theorem states that, for every $k \ge 1$ and every $j_1 \ne \cdots \ne j_k$, the vector $(X_{j_1}, \dots, X_{j_k})$ is a
sample of size $k$ from $D$, where $D$ is a DF process appearing as the limit of the sequence
$P_n$, $n \ge 1$. From now on, when considering a DF process $D$ with parameter $\alpha$, we will
always assume that such a process is the a.s. limit of the sequence $P_n$ associated with a
Polya sequence $\mathbf{X}$ with the same parameter.
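
The urn scheme of this section can be explored on a finite space. The Python sketch below is an illustration under the assumption that $\alpha$ is supported on finitely many points: `polya_joint_probability` evaluates the product formula (14) exactly, which allows a direct check of exchangeability, and `sample_polya` draws a path of the sequence by reinforcing the drawn colour. Both function names, the colour labels and the masses are hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product
import random

def polya_joint_probability(alpha, xs):
    """P(X_1 = x_1, ..., X_k = x_k) from the product formula (14).

    alpha: dict point -> mass of alpha at that point; xs: tuple of points.
    """
    total = sum(alpha.values())
    counts = {}
    prob = Fraction(1)
    for l, x in enumerate(xs):                   # l = number of earlier draws
        prob *= Fraction(alpha.get(x, 0) + counts.get(x, 0)) / (total + l)
        counts[x] = counts.get(x, 0) + 1
    return prob

def sample_polya(alpha, n, rng):
    """Draw n steps of the urn scheme: each drawn colour gains one unit of mass."""
    points = list(alpha)
    weights = [float(alpha[p]) for p in points]
    draws = []
    for _ in range(n):
        x = rng.choices(points, weights=weights)[0]
        weights[points.index(x)] += 1.0
        draws.append(x)
    return draws

# Hypothetical urn with one unit of "r" and two units of "b".
urn = {"r": 1, "b": 2}
probs = {p: polya_joint_probability(urn, p) for p in permutations(("r", "b", "b"))}
total_mass = sum(polya_joint_probability(urn, s) for s in product("rb", repeat=3))
path = sample_polya(urn, 200, random.Random(0))
```

All reorderings of the same colours receive the same probability, and the probabilities of all sequences of a fixed length add up to one. The empirical frequencies of a long `path` play the role of $P_n$ in the theorem.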

3. Urn sequences and Hoeffding-ANOVA
decompositions

In this section, we recall the results of [18] and [16] that are related to generalized Polya
urn sequences. We start by introducing some notation, mostly borrowed from the
above-quoted references.

Fix $N \ge 1$. For any $n \in \{0, 1, \dots, N\}$, we define

\[
V_N(n) := \{k_{(n)} = (k_1, \dots, k_n) : 1 \le k_1 < \cdots < k_n \le N\},
\]

where $k_{(0)} := 0$ and $V_N(0) = \{0\}$, and also $V_\infty(n) = \bigcup_{N \ge 1} V_N(n)$. For $n \ge m \ge 1$,
$l_{(m)} \in V_\infty(m)$ and $k_{(n)} \in V_\infty(n)$, $l_{(m)} \wedge k_{(n)}$ is the set $\{l_i : l_i = k_j \text{ for some } j = 1, \dots, n\}$ written
as an element of $V_\infty(r)$, where $r := \mathrm{Card}\{l_{(m)} \wedge k_{(n)}\}$. Analogously, for any $n, m \ge 0$,
$k_{(n)} \setminus l_{(m)}$ denotes the set $k_{(n)} \cap (l_{(m)})^c$ written as an element of the class $V_\infty(n - r)$.
Finally, given $k_{(n)} \in V_\infty(n)$ and a vector $h_{(m)} = (h_1, \dots, h_m)$, by $h_{(m)} \subset k_{(n)}$ we mean
that $h_{(m)} \in V_\infty(m)$ and that for every $i \in \{1, \dots, m\}$, there exists $j \in \{1, \dots, n\}$ such that
$h_i = k_j$.

Now, consider a Polya sequence $\mathbf{X}$ such as the one defined in the previous section.
For any $n \ge 0$ and every $j_{(n)} \in V_\infty(n)$, we write $\mathbf{X}_{j_{(n)}} = (X_{j_1}, \dots, X_{j_n})$, with $X_0 = 0$. We
recall that the exchangeability of $\mathbf{X}$ implies that, for any $0 \le r \le m \le n$ and for any
symmetric statistic $T$ on $A^n$ such that $E[|T(\mathbf{X}_n)|] < +\infty$, there exists a function $[T]^{(r)}_{n,m}$
on $A^m$, symmetric in the first $r$ variables and in the last $m - r$ variables such that, for
every $j_{(n)} \in V_\infty(n)$ and $i_{(m)} \in V_\infty(m)$ satisfying $\mathrm{Card}(i_{(m)} \wedge j_{(n)}) = r$,

\[
E[T(\mathbf{X}_{j_{(n)}}) \mid \mathbf{X}_{i_{(m)}}] = [T]^{(r)}_{n,m}(\mathbf{X}_{i_{(m)} \wedge j_{(n)}}, \mathbf{X}_{i_{(m)} \setminus j_{(n)}}), \quad \text{a.s.-}P.
\]

In [18], we have provided a complete characterization of the symmetric Hoeffding spaces
associated with the random vector $\mathbf{X}_{j_{(N)}}$ for any $N \ge 1$ and any $j_{(N)} \in V_\infty(N)$. More
precisely, we start by writing $L^2(\mathbf{X}_{j_{(N)}})$ for the Hilbert space of real-valued and
square-integrable functionals of $\mathbf{X}_{j_{(N)}}$ and $L^2_s(\mathbf{X}_{j_{(N)}})$ for the subspace of $L^2(\mathbf{X}_{j_{(N)}})$ composed of
symmetric functionals. Of course, for every $j_{(N)}$, the space $L^2_s(\mathbf{X}_{j_{(N)}})$ coincides with the
space of square-integrable functionals of the empirical random measure generated by the
vector $\mathbf{X}_{j_{(N)}}$. We eventually set $L^2(\mathbf{X})$ to be the space of square-integrable functionals
of the sequence $\mathbf{X}$ (note that $L^2(D) \subset L^2(\mathbf{X})$).

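According to [18], the collection $\{SH_i(\mathbf{X}_{j_{(N)}}), i = 0, \dots, N\}$ of symmetric Hoeffding
spaces associated with $\mathbf{X}_{j_{(N)}}$ is defined as follows. Let $SU_0(\mathbf{X}_{j_{(N)}}) := \mathbb{R}$ and, for
$i = 1, \dots, N$,

\[
SU_i(\mathbf{X}_{j_{(N)}}) := \overline{\text{v.s.}}^{\,L^2(\mathbf{X})}\Big\{T : T = \sum_{j_{(i)} \subset j_{(N)}} \phi(\mathbf{X}_{j_{(i)}}),\ \phi(\mathbf{X}_{j_{(i)}}) \in L^2_s(\mathbf{X}_{j_{(i)}})\Big\},
\]

where v.s.$\{C\}$ is the minimal vector space containing $C$ and

\[
SH_0(\mathbf{X}_{j_{(N)}}) := SU_0(\mathbf{X}_{j_{(N)}}), \qquad
SH_i(\mathbf{X}_{j_{(N)}}) := SU_i(\mathbf{X}_{j_{(N)}}) \ominus SU_{i-1}(\mathbf{X}_{j_{(N)}}), \quad i = 1, \dots, N,
\]

where $\ominus$ denotes orthogonal difference between Hilbert spaces. The reader is referred
to [18] and the references therein for more details about the use and interpretation
of Hoeffding spaces. As discussed in [18], $L^2_s(\mathbf{X}_{j_{(N)}})$ differs from $SU_i(\mathbf{X}_{j_{(N)}})$ for every
$i = 0, \dots, N - 1$. This means, in particular, that

\[
SU_i(\mathbf{X}_{j_{(N)}}) = \bigoplus_{a \le i} SH_a(\mathbf{X}_{j_{(N)}}) \subsetneq L^2_s(\mathbf{X}_{j_{(N)}}) = SU_N(\mathbf{X}_{j_{(N)}}) = \bigoplus_{a = 0, \dots, N} SH_a(\mathbf{X}_{j_{(N)}}).
\]

Now, consider the measure $\alpha$ on $(A, \mathcal{A})$ that determines the law of $\mathbf{X}$. We shall use
the following real constants: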

^( , , , U:it^''\c.{A)+r+p + s-l] 

«>(n,m,r,p) := (m-r)(„_^_p) _ ■ , (15) 

lls=i [<^(^) +n + s- 1\ 

where $1 \le m \le n$, $0 \le r \le m$, $0 \le p \le m - r$, $\alpha(A) + n + m - r > 0$, $(a)_{(b)} := a!/b!$ for $a \ge b$
and $\prod_{s=1}^{0} = 1 = 0!$, by definition, and, for $1 \le q \le m \le n \le N$,

«'Ar(9,n,m):=^(^^^ (^^3"^ $(n,TO,r,g-r) (16) 
with $\binom{a}{b}_{*} := \binom{a}{b}\mathbf{1}_{(a \ge b)}$. We also introduce the coefficients

$\{\theta_N^{(k,a)} : 1 \le k \le N,\ 1 \le a \le k\}$
that are recursively defined by the set of conditions $\{S_N(k), k = 1, \dots, N - 1\}$ given by



S^(fc) ■■ {j^Y^ e^^^^^^iq, k,j)=0, q^l,...,k-l, 



(17) 



and 



JN,a) 



'N 

further set 



Ef=a' ^N^'^^ for a = 1, . . . , iV - 1 and ^j^'"^^ = *w(7V, N, N)-^ = 1. We 



n(fc,Q) ._n(k.a) ( N -a 



'N* ■— "N 



k ~ a 



The following proposition stems from the main results obtained in [18], Section 5. It
provides an algorithm to project symmetric statistics onto Hoeffding spaces.

Proposition 1 (Peccati [18]). Using the above notation and assumptions, fix $j_{(N)} \in V_\infty(N)$
and let $T$ be a centered element of $L^2_s(\mathbf{X}_{j_{(N)}})$. Then, for $s = 1, \dots, N$,



j(s)Cj(jv) 



■a=l j(a)Cj(,) 



where 



and 



.a=l j(a)Cj(,) 



(18) 



TV 



a=lj(a)Cj(jv) 

Moreover, for every $s$, $[\phi_T^{(s)}]^{(r)}_{s,s-1}(\mathbf{X}_{j_{(s-1)}}) = 0$, a.s.-$P$, for any $j_{(s-1)} \in V_\infty(s - 1)$ and
any $0 \le r \le s - 1$.

Note that such a result can be applied to non-centered $T \in L^2_s(\mathbf{X}_{j_{(N)}})$ by considering
the statistic $T' = T - E(T)$. The last relation in Proposition 1 implies that the sequence
$\mathbf{X}$ is weakly independent, as defined in [18], Section 4. We now note a property of the
coefficients $\theta_N^{(k,a)}$ that will be very useful in the sequel and which can be proven by a
standard recurrence argument.

Lemma 1. For every $k = 1, \dots, N$ and every $a = 1, \dots, k$, there exists a real number
$\theta^{(k,a)}$ such that

For instance, by using the computations contained in [18], Section 6, we obtain

\[
\theta^{(1,1)} = \alpha(A) + 1; \qquad
\theta^{(2,2)} = \tfrac{1}{2}(\alpha(A) + 3)(\alpha(A) + 2); \qquad
\theta^{(2,1)} = -\tfrac{1}{2}(\alpha(A) + 3)(\alpha(A) + 1). \tag{20}
\]

We stress that each of the coefficients $\theta_N^{(k,a)}$ can be calculated in a finite number of
steps. Below, it will be proven that the coefficients $\theta^{(k,a)}$ are those appearing in formula
(10) above. To conclude, we present a characterization of symmetric Hoeffding spaces
that is implicitly proved in [18].

Proposition 2. Using the notation and assumptions of Proposition 1, for any $i = 1, \dots, N$,
a centered random variable $T \in L^2_s(\mathbf{X}_{j_{(N)}})$ is an element of $SH_i(\mathbf{X}_{j_{(N)}})$ if and
only if there exists $h^{(i)}$ on $A^i$ such that (1) $h^{(i)} \in \Xi_i(\mathbf{X})$ and (2)

\[
T = \sum_{j_{(i)} \subset j_{(N)}} h^{(i)}(\mathbf{X}_{j_{(i)}}).
\]

Moreover, the function $h^{(i)}$ is unique in the sense that if $h'^{(i)}$ also satisfies conditions
(1)-(2) above, then, $P$-a.s.,

\[
h^{(i)}(\mathbf{X}_{j_{(i)}}) = h'^{(i)}(\mathbf{X}_{j_{(i)}}).
\]

Note that the first part of the statement of Proposition 2 implies that the sequence $\mathbf{X}$
is Hoeffding decomposable, in the sense of [18], Definition 1.

4. Main results 

4.1. Preliminaries on multiple integrals with respect to DF 
processes 

Let $D$ be a DF process with parameter $\alpha$. We shall explore the properties of objects of
the form

\[
\int_{A^n} h^{(n)}\,dD^{\otimes n}, \tag{21}
\]

where $h^{(n)}$ is a real-valued function on $A^n$ such that

\[
E[h^{(n)}(\mathbf{X}_n)] = 0 \tag{22}
\]

and

\[
h^{(n)}(\mathbf{X}_n) \in L^2(\mathbf{X}_n), \tag{23}
\]

where, as before, $\mathbf{X}_n$ represents the first $n$ instants of the associated Polya sequence with
parameter $\alpha$.

Remark. Note that considering multiple integrals of symmetric functions is not a
restriction. As a matter of fact, let $f^{(n)}$ be a measurable and not necessarily symmetric
function on $A^n$. The symmetry of the product measure then yields that

\[
\int_{A^n} f^{(n)}\,dD^{\otimes n} = \int_{A^n} \tilde{f}^{(n)}\,dD^{\otimes n},
\]

where

\[
\tilde{f}^{(n)}(a_1, \dots, a_n) = \frac{1}{n!}\sum_{\sigma} f^{(n)}(a_{\sigma(1)}, \dots, a_{\sigma(n)})
\]

and $\sigma = (\sigma(1), \dots, \sigma(n))$ runs over all permutations of $(1, \dots, n)$.
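
The symmetrization used in the Remark can be written in a few lines. The Python sketch below is a direct transcription of the averaging over permutations. The helper name `symmetrize` and the sample function `f` are hypothetical, and the cost grows like $n!$, so it is meant only for small $n$.

```python
from itertools import permutations
from math import factorial

def symmetrize(f, n):
    """Return the average of f over all permutations of its n arguments."""
    def f_sym(*args):
        if len(args) != n:
            raise ValueError("expected %d arguments" % n)
        total = sum(f(*(args[i] for i in sigma)) for sigma in permutations(range(n)))
        return total / factorial(n)
    return f_sym

# A non-symmetric function of three variables and its symmetrized version.
f = lambda a, b, c: a + 2 * b * c
f_sym = symmetrize(f, 3)
```

Because the product measure $D^{\otimes n}$ is invariant under permutations of the coordinates, integrating `f` or `f_sym` against it gives the same value.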

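Objects such as (21) define square integrable random variables whose variances and
covariances are explicitly known. To see this, we use the notation of the previous section
and write

\[
h^{(n)}(\mathbf{X}_n) = \sum_{s=1}^{n} \pi[h^{(n)}, SH_s](\mathbf{X}_n) = \sum_{s=1}^{n} \sum_{j_{(s)} \in V_n(s)} \phi^{(s)}_{(h)}(\mathbf{X}_{j_{(s)}}), \quad P\text{-a.s.},
\]

where $\pi[h^{(n)}, SH_s]$ is the projection on the $s$th symmetric Hoeffding space generated
by $\mathbf{X}_n$ and the function $\phi^{(s)}_{(h)} \in \Xi_s(\mathbf{X})$ is given by applying formula (18) to the r.v.
$h^{(n)}(\mathbf{X}_n)$, regarded as a symmetric element of $L^2(\mathbf{X}_n)$. This immediately yields the
following identity, again due to the symmetry of the product measure:

\[
\int_{A^n} h^{(n)}\,dD^{\otimes n} = \sum_{s=1}^{n} \binom{n}{s} \int_{A^s} \phi^{(s)}_{(h)}\,dD^{\otimes s}. \tag{24}
\]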

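Proposition 3 (Covariance between multiple integrals of the same order).
For $n \ge 1$, let $h^{(n)}$ and $f^{(n)}$ be real-valued and symmetric functions on $A^n$ satisfying
conditions (22) and (23). Then,

\[
E\Big[\int_{A^n} h^{(n)}\,dD^{\otimes n} \int_{A^n} f^{(n)}\,dD^{\otimes n}\Big]
 = \sum_{s=1}^{n} \binom{n}{s}^{2} \prod_{l=1}^{s} \frac{s-l+1}{\alpha(A)+s+l-1}\,
   E[\phi^{(s)}_{(h)}(\mathbf{X}_s)\,\phi^{(s)}_{(f)}(\mathbf{X}_s)].
\]

Proof. This is a consequence of Corollary 8 in [18]. Start by writing

\[
E\Big[\int_{A^n} h^{(n)}\,dD^{\otimes n} \int_{A^n} f^{(n)}\,dD^{\otimes n}\Big]
 = \sum_{s=1}^{n} \sum_{t=1}^{n} \binom{n}{s}\binom{n}{t}
   E\Big[\int_{A^s} \phi^{(s)}_{(h)}\,dD^{\otimes s} \int_{A^t} \phi^{(t)}_{(f)}\,dD^{\otimes t}\Big]
\]

and observe that $D$ is the directing measure of $\mathbf{X}$, yielding that, for every $s, t$,

\[
E\Big[\int_{A^s} \phi^{(s)}_{(h)}\,dD^{\otimes s} \int_{A^t} \phi^{(t)}_{(f)}\,dD^{\otimes t}\Big]
 = E[\phi^{(s)}_{(h)}(\mathbf{X}_{j_{(s)}})\,\phi^{(t)}_{(f)}(\mathbf{X}_{i_{(t)}})]
 = \begin{cases}
   0, & \text{if } t \ne s, \\[4pt]
   \displaystyle\prod_{l=1}^{s} \frac{s-l+1}{\alpha(A)+s+l-1}\,E[\phi^{(s)}_{(h)}(\mathbf{X}_s)\,\phi^{(s)}_{(f)}(\mathbf{X}_s)], & \text{if } t = s,
   \end{cases}
\]

where $j_{(s)} \in V_\infty(s)$, $i_{(t)} \in V_\infty(t)$ and $\mathrm{Card}(j_{(s)} \wedge i_{(t)}) = 0$, due to the fact that
$\phi^{(s)}_{(h)}, \phi^{(s)}_{(f)} \in \Xi_s(\mathbf{X})$ for every $s$, as well as Corollary 8 in [18]. □

By choosing $f^{(n)} = h^{(n)}$, we immediately obtain that random variables such as (21)
are square integrable. Note, moreover, that Proposition 3 contains, as a very special case,
the classic computations of [8], Theorem 4. We now show that multiple integrals of the
same order are almost uniquely defined.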

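Proposition 4. Let $h^{(n)}$ and $f^{(n)}$ be symmetric functions on $A^n$ that satisfy (22) and
(23). The condition

\[
\int_{A^n} h^{(n)}\,dD^{\otimes n} = \int_{A^n} f^{(n)}\,dD^{\otimes n}, \quad P\text{-a.s.},
\]

then implies that, for every $j_{(n)} \in V_\infty(n)$,

\[
h^{(n)}(\mathbf{X}_{j_{(n)}}) = f^{(n)}(\mathbf{X}_{j_{(n)}}), \quad P\text{-a.s.}
\]

Proof. We first consider the case where both $h^{(n)}$ and $f^{(n)}$ are elements of $\Xi_n(\mathbf{X})$. To
prove the result in this case, we simply observe that the assumptions and Proposition 3
imply that

\[
\begin{aligned}
E\Big[\int_{A^n} h^{(n)}\,dD^{\otimes n} \int_{A^n} f^{(n)}\,dD^{\otimes n}\Big]
 &= E\Big[\Big(\int_{A^n} h^{(n)}\,dD^{\otimes n}\Big)^2\Big]
  = E\Big[\Big(\int_{A^n} f^{(n)}\,dD^{\otimes n}\Big)^2\Big] \\
 &= c(n,\alpha(A))\,E[h^{(n)}(\mathbf{X}_n)f^{(n)}(\mathbf{X}_n)] \\
 &= c(n,\alpha(A))\,E[h^{(n)}(\mathbf{X}_n)^2] \\
 &= c(n,\alpha(A))\,E[f^{(n)}(\mathbf{X}_n)^2],
\end{aligned}
\]

where, for any $n \ge 1$,

\[
c(n,\alpha(A)) := \prod_{l=1}^{n} \frac{n-l+1}{\alpha(A)+n+l-1} > 0, \tag{25}
\]

thus yielding

\[
E[(h^{(n)}(\mathbf{X}_n) - f^{(n)}(\mathbf{X}_n))^2] = 0.
\]

Now, given general $h^{(n)}$ and $f^{(n)}$ as in the statement, we write

\[
\int_{A^n} h^{(n)}\,dD^{\otimes n} = \sum_{s=1}^{n} \binom{n}{s} \int_{A^s} \phi^{(s)}_{(h)}\,dD^{\otimes s}
\quad\text{and}\quad
\int_{A^n} f^{(n)}\,dD^{\otimes n} = \sum_{s=1}^{n} \binom{n}{s} \int_{A^s} \phi^{(s)}_{(f)}\,dD^{\otimes s},
\]

using the notation of formula (24), so that the relation

\[
\int_{A^n} h^{(n)}\,dD^{\otimes n} = \int_{A^n} f^{(n)}\,dD^{\otimes n}, \quad P\text{-a.s.},
\]

implies that

\[
0 = E\Big[\Big(\int_{A^n} h^{(n)}\,dD^{\otimes n} - \int_{A^n} f^{(n)}\,dD^{\otimes n}\Big)^2\Big]
  = \sum_{s=1}^{n} \binom{n}{s}^{2} c(s,\alpha(A))\,E[(\phi^{(s)}_{(h)} - \phi^{(s)}_{(f)})^2(\mathbf{X}_s)].
\]

This immediately yields, for every $s = 1, \dots, n$,

\[
\phi^{(s)}_{(h)}(\mathbf{X}_s) = \phi^{(s)}_{(f)}(\mathbf{X}_s), \quad P\text{-a.s.},
\]

and therefore, $P$-a.s.,

\[
h^{(n)}(\mathbf{X}_n) = \sum_{s=1}^{n} \sum_{j_{(s)} \in V_n(s)} \phi^{(s)}_{(h)}(\mathbf{X}_{j_{(s)}})
 = \sum_{s=1}^{n} \sum_{j_{(s)} \in V_n(s)} \phi^{(s)}_{(f)}(\mathbf{X}_{j_{(s)}}) = f^{(n)}(\mathbf{X}_n). \qquad □
\]
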

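We now introduce the following class of subspaces of $L^2(D)$: $\mathcal{M}_0(D) = \mathbb{R}$ and, for every
$n\geq 1$,

$$\mathcal{M}_n(D) = \left\{Y\in L^2(D) : Y = \int_{A^n} h^{(n)}\,dD^{\otimes n}\ \text{and}\ h^{(n)}\in\Xi_n(\mathbf{X})\right\}. \qquad (26)$$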



By Proposition 3, it is immediate that, for any $n$, the set $\mathcal{M}_n(D)$ is an $L^2$-closed
vector space isomorphic to $\sqrt{c(n,\alpha(A))}\,\Xi_n(\mathbf{X})$, which is, by definition, isomorphic to
$\sqrt{c(n,\alpha(A))}\,SH_n(\mathbf{X}_n)$, that is, to the $n$th symmetric Hoeffding space associated with
$\mathbf{X}_n$, endowed with the modified inner product $c(n,\alpha(A))\times\langle\cdot,\cdot\rangle_{L^2(\mathbf{X})}$, where $c(n,\alpha(A))$
is defined in (25). Moreover, $\mathcal{M}_n(D)\perp\mathcal{M}_k(D)$ in $L^2(D)$ for every $k\neq n$. As a matter of
fact, if we let $n>k$ and suppose that $h^{(n)}\in\Xi_n(\mathbf{X})$ and $h^{(k)}\in\Xi_k(\mathbf{X})$, then

$$\begin{aligned}
\mathbb{E}\left[\int_{A^n} h^{(n)}\,dD^{\otimes n}\int_{A^k} h^{(k)}\,dD^{\otimes k}\right] &= \mathbb{E}[h^{(n)}(X_1,\dots,X_n)\,h^{(k)}(X_{n+1},\dots,X_{n+k})]\\
&= \mathbb{E}[\mathbb{E}(h^{(n)}(X_1,\dots,X_n)\mid X_{n+1},\dots,X_{n+k})\times h^{(k)}(X_{n+1},\dots,X_{n+k})] = 0.
\end{aligned}$$

Proposition 4 ensures that every element of $\mathcal{M}_n(D)$ admits a unique representation as
an integral of an element of $\Xi_n(\mathbf{X})$ with respect to $D^{\otimes n}$.
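As a sanity check of the normalisation (25) in the simplest case $n=1$, the following sketch compares $c(1,\alpha(A))\,\mathbb{E}[h(X_1)^2]$ with the variance of $D(C)$ for the centered kernel $h = \mathbf{1}_C - \mathbb{P}(X_1\in C)$. It relies on the fact that $D(C)$ has a Beta distribution with parameters $(\alpha(C),\alpha(A\setminus C))$. The function names and the integer masses $\alpha(A)=5$, $\alpha(C)=2$ are illustrative assumptions, not part of the paper.

```python
from fractions import Fraction


def c(n, total_mass):
    """Coefficient c(n, alpha(A)) of formula (25), as an exact fraction.

    total_mass plays the role of alpha(A); an integer is assumed here so
    that exact rational arithmetic can be used.
    """
    value = Fraction(1)
    for l in range(1, n + 1):
        value *= Fraction(n - l + 1, total_mass + n + l - 1)
    return value


def variance_of_mass(mass_c, total_mass):
    """Variance of D(C) when D(C) ~ Beta(alpha(C), alpha(A) - alpha(C))."""
    p = Fraction(mass_c, total_mass)
    return p * (1 - p) / (total_mass + 1)


# Hypothetical example values: alpha(A) = 5 and alpha(C) = 2.
total_mass, mass_c = 5, 2
p = Fraction(mass_c, total_mass)            # P(X_1 in C)
kernel_second_moment = p * (1 - p)          # E[(1_C(X_1) - p)^2]
lhs = variance_of_mass(mass_c, total_mass)  # E[(D(C) - p)^2]
rhs = c(1, total_mass) * kernel_second_moment
print(lhs, rhs)
```

Both printed values should coincide, which is the $n=1$ instance of the isometry between $\mathcal{M}_n(D)$ and $\sqrt{c(n,\alpha(A))}\,\Xi_n(\mathbf{X})$.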

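Remark. In general, if a random variable $Z\in L^2(D)$ admits a representation of the
type $Z = \int_{A^n} f^{(n)}\,dD^{\otimes n}$, where $f^{(n)}$ is symmetric and satisfies (23), then $Z$ can be also
written as $Z = \int_{A^{n+1}}\tilde f^{(n+1)}\,dD^{\otimes (n+1)}$, where

$$\tilde f^{(n+1)}(a_1,\dots,a_{n+1}) = \frac{1}{(n+1)!}\sum_{\sigma} f^{(n)}(a_{\sigma(1)},\dots,a_{\sigma(n)})\,\mathbf{1}_A(a_{\sigma(n+1)})$$

and $\sigma$ runs over all permutations of the class $\{1,\dots,n+1\}$. However, the orthogonality
relations discussed above ensure that if there exist $f^{(n)}\in\Xi_n(\mathbf{X})$ and $g^{(n+1)}\in\Xi_{n+1}(\mathbf{X})$
such that

$$Z = \int_{A^n} f^{(n)}\,dD^{\otimes n} = \int_{A^{n+1}} g^{(n+1)}\,dD^{\otimes (n+1)},$$

then, necessarily, $Z = 0$, $\mathbb{P}$-a.s., and therefore, by Proposition 4, $f^{(n)} = g^{(n+1)} = 0$. This
also implies that if

$$\int_{A^n} h^{(n)}\,dD^{\otimes n} = \int_{A^{n+1}} h^{(n+1)}\,dD^{\otimes (n+1)},$$

where $h^{(k)}$, $k = n, n+1$, are symmetric and satisfy (22) and (23), then, necessarily,
$\pi[h^{(n+1)}, SH_{n+1}](\mathbf{X}_{n+1}) = 0$, $\mathbb{P}$-a.s.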



4.2. Chaotic decomposition of $L^2(D)$: U-statistics and
polynomial complexity

We shall now show that the sequence $\{\mathcal{M}_n(D) : n\geq 0\}$ defines an orthogonal decomposition
of the space $L^2(D)$. To see this, we state and prove a simple result.
Lemma 2. On a probability space $(S,\mathcal{S},Q)$, let $\nu(\cdot\,;\omega)$ be a random non-negative measure
on $(A,\mathcal{A})$ and denote by $L^2(\nu)$ the class of square integrable functionals of $\nu$. Suppose
that there exists $q>0$ such that

$$\sup_{C\in\mathcal{A}}\nu(C;\cdot) < q,\qquad Q\text{-a.s.} \qquad (27)$$

Then, the class of random variables,

$$\left\{\prod_{j=1}^{n}(\nu(C_j))^{k_j} : n\geq 0,\ C_1,\dots,C_n\in\mathcal{A},\ 1\leq k_1\leq\dots\leq k_n<+\infty\right\}$$

is total in $L^2(\nu)$.

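Proof. We simply note that (27) implies that, for every $C_1,\dots,C_n\in\mathcal{A}$ and every
$(\lambda_1,\dots,\lambda_n)\in\mathbb{R}^n$,

$$\exp\left(\sum_{j=1}^{n}\lambda_j\nu(C_j)\right) = L^2\text{-}\lim_{N\to+\infty}\prod_{j=1}^{n}\sum_{i=0}^{N}\frac{(\lambda_j\nu(C_j))^i}{i!},$$

and that r.v.'s of the type $\exp(\sum_{j=1}^{n}\lambda_j\nu(C_j))$ are trivially total in $L^2(\nu)$. □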

We now come to the main result of this subsection. 

Theorem 1. Let $D$ be a DF process of parameter $\alpha$. Every $F\in L^2(D)$ then admits a
unique representation of the type (6), with $h_{(F,n)}\in\Xi_n(\mathbf{X})$ for every $n\geq 1$.

Proof. Due to Lemma 2, and since $D$ is a random probability measure, it is sufficient
to show that random variables of the form

$$F = \prod_{j=1}^{n}(D(C_j))^{k_j},\qquad C_1,\dots,C_n\in\mathcal{A},$$

admit the representation (6). But,

$$F = \mathbb{E}(F) + \int_{A^{K_n}} f_{(C_1,\dots,C_n)}\,dD^{\otimes K_n},$$

where $K_t := k_1+\dots+k_t$ for $t = 0,\dots,n$ (of course, $K_0 = 0$) and

$$f_{(C_1,\dots,C_n)}(a_1,\dots,a_{K_n}) := \frac{1}{K_n!}\sum_{\sigma}\prod_{j=1}^{n}\ \prod_{i=K_{j-1}+1}^{K_j}\mathbf{1}_{C_j}(a_{\sigma(i)}) - \mathbb{E}(F),$$

with $\sigma$ running over all permutations of $(1,\dots,K_n)$. We now apply formula (18) to the
function $f_{(C_1,\dots,C_n)}$ to obtain

$$f_{(C_1,\dots,C_n)}(a_1,\dots,a_{K_n}) = \sum_{s=1}^{K_n}\sum_{j(s)\in V_{K_n}(s)}\phi_f^{(s)}(a_{j(s)}),$$

which implies that $F$ admits a decomposition such as (6), with

$$h_{(F,s)} = \binom{K_n}{s}\phi_f^{(s)}$$

for $s\leq K_n$ and $h_{(F,s)} = 0$ for $s>K_n$. The general result is achieved by using a standard
density argument as well as the fact that each $\mathcal{M}_s(D)$ is an $L^2$-closed vector space. □

Remarks. (a) Since $D$ is the directing measure of the sequence $\mathbf{X} = \{X_n : n\geq 1\}$, for
every $n\geq 1$ and every $h_{(n)}\in L^2(\mathbf{X}_n)$, we have $\int_{A^n} h_{(n)}\,dD^{\otimes n} = \mathbb{E}[h_{(n)}(\mathbf{X}_n)\mid D]$. It follows
that, for every $F\in L^2(D)$, formula (6) can be rewritten as

$$F = \mathbb{E}(F) + \sum_{n\geq 1}\mathbb{E}[h_{(F,n)}(\mathbf{X}_n)\mid D]. \qquad (28)$$

(b) We may obtain a representation similar to (28) by using elementary martingale
theory. Indeed, consider a random variable $H\in L^2(\mathbf{X})$, as well as the filtration $\mathcal{X}_n =
\sigma(\mathbf{X}_n)$, $n\geq 0$. It is clear that the process $Y_n = \mathbb{E}[H\mid\mathcal{X}_n]$ is a square-integrable $\mathcal{X}_n$-martingale
such that $Y_n\to H$ as $n\to+\infty$. Now, define $g_{(H,n)}(\mathbf{X}_n) = \mathbb{E}[H\mid\mathcal{X}_n] - \mathbb{E}[H\mid\mathcal{X}_{n-1}]$,
$n>0$, so that we obtain immediately that

$$H = \mathbb{E}(H) + \sum_{n\geq 1} g_{(H,n)}(\mathbf{X}_n),$$

where the series on the right converges in $L^2(\mathbf{X})$. As a consequence, by conditioning with
respect to $D$, we obtain

$$\mathbb{E}[H\mid D] - \mathbb{E}[\mathbb{E}[H\mid D]] = \sum_{n\geq 1}\mathbb{E}[g_{(H,n)}(\mathbf{X}_n)\mid D] = \sum_{n\geq 1}\int_{A^n} g_{(H,n)}\,dD^{\otimes n}. \qquad (29)$$

However, the above representation of $\mathbb{E}[H\mid D]$ is not "chaotic" in the proper sense.
As a matter of fact, since the kernels $g_{(H,n)}$ are, in general, not symmetric, the integrals
$\int_{A^n} g_{(H,n)}\,dD^{\otimes n}$ appearing in (29) may be not orthogonal in $L^2(D)$, although
$\mathbb{E}[g_{(H,n)}(\mathbf{X}_n)\,g_{(H,m)}(\mathbf{X}_m)] = 0$, for $m\neq n$. To see this, simply consider $H = h(\mathbf{X}_2)$ such
that $\pi[H, SH_1]\neq 0$.

In what follows, we give two characterizations of the spaces $\mathcal{M}_j(D)$: in terms of polynomial
complexity and in terms of U-statistics based on finite samples of the underlying
sequence $\mathbf{X}$. Both are related to the following result.
Lemma 3. Let $\mu_n$ be a symmetric and finite measure on the product space $(A^n,\mathcal{A}^n)$,
$n\geq 2$, that is,

$$\mu_n(C_1\times\dots\times C_n) = \mu_n(C_{\sigma(1)}\times\dots\times C_{\sigma(n)})$$

for every $C_1,\dots,C_n\in\mathcal{A}$ and every permutation $\sigma$ of $(1,\dots,n)$, and denote by $L^2_s(\mu_n)$
the space of symmetric and measurable functions $f$ on $A^n$ such that

$$\int_{A^n} f^2\,d\mu_n < +\infty.$$

The class

$$\mathcal{H}_n := \left\{h : h(a_1,\dots,a_n) = \sum_{\sigma}\prod_{j=1}^{n}\mathbf{1}_{C_{\sigma(j)}}(a_j),\ C_j\in\mathcal{A},\ j = 1,\dots,n\right\}, \qquad (30)$$

where $\sigma$ in the summation runs over all permutations of $\{1,\dots,n\}$, is then total in $L^2_s(\mu_n)$.

Proof. Consider an element $g\in L^2_s(\mu_n)$ such that $g\perp\mathcal{H}_n$. This means that

$$\int_{A^n} g\sum_{\sigma}\prod_{j=1}^{n}\mathbf{1}_{C_{\sigma(j)}}\,d\mu_n = 0$$

for every $C_1,\dots,C_n\in\mathcal{A}$. Now, the left side of the preceding formula equals

$$n!\int_{A^n} g\prod_{j=1}^{n}\mathbf{1}_{C_j}\,d\mu_n$$

due to the symmetry of $\mu_n$ and $g$. However, functions such as $\prod_{j=1}^{n}\mathbf{1}_{C_j}$ are total in
$L^2(\mu_n)$, that is, the space of square-integrable (and not necessarily symmetric) functions
on $A^n$. This implies that $g = 0$, $\mu_n$-a.s., and the result is therefore proved. □

In what follows, for every DF process $D$, we define $\mathcal{P}_0(D) := \mathbb{R}$ and, for $N\geq 1$, we set

$$\mathcal{P}_N(D) := \overline{\mathrm{v.s.}}\left\{\prod_{j=1}^{n}(D(C_j))^{k_j} : n\geq 0,\ C_1,\dots,C_n\in\mathcal{A},\ k_j\geq 1,\ K_n\leq N\right\},$$

where $K_n = \sum_{j=1}^{n}k_j$, that is, $\mathcal{P}_N(D)$ is the $L^2$-closed vector space generated by the polynomial
functionals of $D$ with order less than or equal to $N$. Note that the sequence
$\{\mathcal{P}_N(D) : N\geq 0\}$ generalizes the class $\{\mathcal{P}_n(\eta) : n\geq 0\}$ defined in the Introduction. In
particular, $\mathcal{P}_N(D)\subset\mathcal{P}_{N+1}(D)$ for every $N\geq 0$ and, again due to Lemma 2, the union of
the $\mathcal{P}_N(D)$'s is dense in $L^2(D)$. Consistent with the notation introduced above, we will
also write $\mathcal{J}_0(D) := \mathbb{R}$ and, for $N\geq 1$, $\mathcal{J}_N(D) := \mathcal{P}_N(D)\cap\mathcal{P}_{N-1}(D)^{\perp}$. The next result
shows that the elements of $\mathcal{M}_n(D)$ are the analogues of Jacobi polynomials for general
DF processes.



Proposition 5. Using the above notation and assumptions, for every $N\geq 0$,

$$\mathcal{J}_N(D) = \mathcal{M}_N(D)\quad\text{and}\quad\bigoplus_{i=0}^{N}\mathcal{M}_i(D) = \mathcal{P}_N(D). \qquad (31)$$



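Proof. As a by-product of the proof of Theorem 1, we know that every element of $\mathcal{P}_N(D)$
also belongs to $\bigoplus_{i=0}^{N}\mathcal{M}_i(D)$. Now, consider, for simplicity, a centered $F\in L^2(D)$ such
that

$$F = \sum_{n=1}^{N}\int_{A^n} h_{(F,n)}\,dD^{\otimes n},$$

with $h_{(F,n)}\in\Xi_n(\mathbf{X})$ and suppose that $F\perp\mathcal{P}_N(D)$. This implies, in particular, that, for
every $C\in\mathcal{A}$,

$$\begin{aligned}
0 &= \mathbb{E}\left[\int_A h_{(F,1)}\,dD\int_A\mathbf{1}_C\,dD\right]\\
&= \mathbb{E}[h_{(F,1)}(X_1)\mathbf{1}_C(X_2)]\\
&= \mathbb{E}[h_{(F,1)}(X_1)(\mathbf{1}_C(X_2)-\mathbb{P}(X_2\in C))]\\
&= c(1,\alpha(A))\,\mathbb{E}[h_{(F,1)}(X_1)(\mathbf{1}_C(X_1)-\mathbb{P}(X_1\in C))]\\
&= c(1,\alpha(A))\,\mathbb{E}[h_{(F,1)}(X_1)\mathbf{1}_C(X_1)],
\end{aligned}$$

where the coefficients $c(n,\alpha(A))$, $n\geq 1$, are given by (25), thus yielding $h_{(F,1)}(X_1) = 0$,
$\mathbb{P}$-a.s. We now use a recurrence argument. Suppose that we have shown that $F\perp\mathcal{P}_N(D)$
implies $h_{(F,j)} = 0$ for every $j = 1,\dots,n-1$, where $n\leq N$. We then have, necessarily, for
every $C_1,\dots,C_n\in\mathcal{A}$, with the same notation as in the previous section,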



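$$0 = \mathbb{E}\left[\int_{A^n} h_{(F,n)}\,dD^{\otimes n}\prod_{j=1}^{n}D(C_j)\right] = \mathbb{E}\left[\int_{A^n} h_{(F,n)}\,dD^{\otimes n}\int_{A^n} f_{(C_1,\dots,C_n)}\,dD^{\otimes n}\right],$$

where, with the usual notation,

$$f_{(C_1,\dots,C_n)}(a_1,\dots,a_n) = \frac{1}{n!}\sum_{\sigma}\prod_{j=1}^{n}\mathbf{1}_{C_j}(a_{\sigma(j)}),$$

and therefore

$$\begin{aligned}
0 &= \mathbb{E}\big[h_{(F,n)}(X_1,\dots,X_n)\times\pi[f_{(C_1,\dots,C_n)},SH_n](X_{n+1},\dots,X_{2n})\big]\\
&= c(n,\alpha(A))\,\mathbb{E}\big[h_{(F,n)}(\mathbf{X}_n)\,\pi[f_{(C_1,\dots,C_n)},SH_n](\mathbf{X}_n)\big]\\
&= c(n,\alpha(A))\,\mathbb{E}\left[h_{(F,n)}(X_1,\dots,X_n)\prod_{j=1}^{n}\mathbf{1}_{C_j}(X_j)\right],
\end{aligned}$$

thus implying, by Lemma 3 and the fact that the law of $(X_1,\dots,X_n)$ induces a symmetric
measure on $A^n$, $h_{(F,n)}(\mathbf{X}_n) = 0$, $\mathbb{P}$-a.s. The proof is completed by means of standard
arguments. □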

As announced, we have a second representation for the family $\{\mathcal{M}_n(D) : n\geq 0\}$.

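Proposition 6. Using the previous notation, let $D$ be a DF process with parameter
$\alpha$ and consider the associated Pólya sequence with the same parameter $\alpha$. For every
$N\geq 1$, the space $\bigoplus_{i=0}^{N}\mathcal{M}_i(D) = \mathcal{P}_N(D)$ is then generated by random variables with the
representation

$$Y = L^2\text{-}\lim_{K\to+\infty}\binom{K}{N}^{-1}\sum_{j(N)\in V_K(N)} h(\mathbf{X}_{j(N)}),\qquad h\in\mathcal{H}_N, \qquad (32)$$

where the family $\mathcal{H}_N$ is defined as in (30).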



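Proof. We first prove that if $Y$ satisfies (32), then $Y = \int_{A^N} h\,dD^{\otimes N}$. It is sufficient
to prove such a claim for $N = 2$ and the general case can be achieved by a standard
recurrence argument. To see this, simply choose

$$h(a_1,a_2) = [\mathbf{1}_{C_1}(a_1)\mathbf{1}_{C_2}(a_2) + \mathbf{1}_{C_1}(a_2)\mathbf{1}_{C_2}(a_1)],$$

where $C_1, C_2\in\mathcal{A}$, so that

$$\begin{aligned}
\frac{1}{2}Y &= L^2\text{-}\lim_{K\to+\infty}\frac{1}{K(K-1)}\sum_{j(2)\in V_K(2)} h(\mathbf{X}_{j(2)})\\
&= L^2\text{-}\lim_{K\to+\infty}\frac{1}{K^2}\sum_{j(2)\in V_K(2)} h(\mathbf{X}_{j(2)}) + L^2\text{-}\lim_{K\to+\infty}\frac{1}{K^2}\sum_{i=1}^{K}\mathbf{1}_{C_1}(X_i)\mathbf{1}_{C_2}(X_i),
\end{aligned}$$

where the last equality follows from the Blackwell-MacQueen theorem, and therefore

$$\begin{aligned}
\frac{1}{2}Y &= \lim_{K\to+\infty}\frac{1}{K^2}\left(\sum_{1\leq j_1<j_2\leq K}[\mathbf{1}_{C_1}(X_{j_1})\mathbf{1}_{C_2}(X_{j_2}) + \mathbf{1}_{C_1}(X_{j_2})\mathbf{1}_{C_2}(X_{j_1})] + \sum_{i=1}^{K}\mathbf{1}_{C_1}(X_i)\mathbf{1}_{C_2}(X_i)\right)\\
&= \lim_{K\to+\infty}\left(\frac{1}{K}\sum_{i=1}^{K}\mathbf{1}_{C_1}(X_i)\right)\left(\frac{1}{K}\sum_{i=1}^{K}\mathbf{1}_{C_2}(X_i)\right)\\
&= \int_{A^2}\mathbf{1}_{C_1}\mathbf{1}_{C_2}\,dD^{\otimes 2} = \frac{1}{2}\int_{A^2} h\,dD^{\otimes 2}.
\end{aligned}$$

To complete the proof, we simply observe that an application of Lemma 3, similar to
the one performed in the proof of Proposition 5, implies that random variables of the
type

$$\int_{A^N} h\,dD^{\otimes N},\qquad h\in\mathcal{H}_N,$$

are total in $\bigoplus_{i=0}^{N}\mathcal{M}_i(D)$ for every $N\geq 1$. □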
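The case $N=2$ treated in the proof can be explored numerically. The sketch below simulates a Pólya sequence through the Blackwell-MacQueen urn scheme, evaluates the U-statistic with kernel $h(a_1,a_2)=\mathbf{1}_{C_1}(a_1)\mathbf{1}_{C_2}(a_2)+\mathbf{1}_{C_1}(a_2)\mathbf{1}_{C_2}(a_1)$, and compares it with twice the product of the empirical frequencies of $C_1$ and $C_2$. It assumes, purely for illustration, that $A$ is a finite set of categories $0,\dots,d-1$ and that $\alpha$ is given by a list of positive weights; all function names are hypothetical.

```python
import random
from itertools import combinations


def polya_sequence(weights, length, rng):
    """Simulate X_1..X_length from the urn scheme on categories 0..d-1.

    The next draw equals category x with probability proportional to
    weights[x] plus the number of previous draws equal to x.
    """
    counts = list(weights)
    draws = []
    for _ in range(length):
        x = rng.choices(range(len(counts)), weights=counts)[0]
        draws.append(x)
        counts[x] += 1
    return draws


def make_kernel(c1, c2):
    """Symmetric kernel h of the class H_2 built from the sets c1 and c2."""
    def h(a1, a2):
        return (a1 in c1) * (a2 in c2) + (a2 in c1) * (a1 in c2)
    return h


def u_statistic(h, xs):
    """binom(K, 2)^(-1) times the sum of h over all pairs j1 < j2."""
    k = len(xs)
    total = sum(h(xs[i], xs[j]) for i, j in combinations(range(k), 2))
    return total / (k * (k - 1) / 2)


def twice_product_of_frequencies(c1, c2, xs):
    """2 * (empirical frequency of c1) * (empirical frequency of c2)."""
    k = len(xs)
    f1 = sum(x in c1 for x in xs) / k
    f2 = sum(x in c2 for x in xs) / k
    return 2 * f1 * f2


rng = random.Random(0)
xs = polya_sequence([1.0, 2.0, 0.5], 200, rng)   # hypothetical alpha
c1, c2 = {0}, {0, 1}
print(u_statistic(make_kernel(c1, c2), xs),
      twice_product_of_frequencies(c1, c2, xs))
```

For a sample of size $K$ the two quantities differ by at most $4/(K-1)$, since the sum over pairs equals $S_1S_2-S_{12}$ with $S_i=\sum_k\mathbf{1}_{C_i}(X_k)$ and $S_{12}=\sum_k\mathbf{1}_{C_1\cap C_2}(X_k)$; as $K$ grows, both approach $2D(C_1)D(C_2)=\int_{A^2}h\,dD^{\otimes 2}$.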

The following consequence of Proposition 6 will play an important role in the sequel. 

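Corollary 2. For every $N\geq 1$, the space $\mathcal{M}_N(D)$ is generated by random variables of
the type

$$L^2\text{-}\lim_{K\to+\infty}\binom{K}{N}^{-1}\sum_{j(N)\in V_K(N)} h(\mathbf{X}_{j(N)}),\qquad h\in\Xi_N(\mathbf{X}). \qquad (33)$$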
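Proof. We use formula (18) to write, for a given $h\in\mathcal{H}_N$, the following identities:

$$F = \int_{A^N} h\,dD^{\otimes N} = \mathbb{E}(F) + \sum_{i=1}^{N}\binom{N}{i}\int_{A^i}\phi_h^{(i)}\,dD^{\otimes i}$$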

and 

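It is now clear that, since such r.v.'s as

$$\int_{A^N} h\,dD^{\otimes N},\qquad h\in\mathcal{H}_N,$$

are dense in $\bigoplus_{i=0}^{N}\mathcal{M}_i(D)$, for every $i\leq N$, objects of the form

$$\int_{A^i}\phi_h^{(i)}\,dD^{\otimes i},\qquad h\in\mathcal{H}_N,$$

where the $\phi_h^{(i)}$ are defined via (18), are dense in $\mathcal{M}_i(D)$. To demonstrate the result, it is
therefore sufficient to prove the following relation: for every $i\geq 1$, if $g\in\Xi_i(\mathbf{X})$, then, for
every $j(i-1)\in V_\infty(i-1)$,

$$\mathbb{E}\left[\int_{A^i} g\,dD^{\otimes i}\ \Big|\ \mathbf{X}_{j(i-1)}\right] = 0,\qquad \mathbb{P}\text{-a.s.} \qquad (34)$$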



As a matter of fact, this would immediately imply that 



dD'^' = L'- lim 



\N-i) 



j(>)eyK(0 



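To prove (34), we simply use the identity

$$\mathbb{E}\left[\int_{A^i} g\,dD^{\otimes i}\ \Big|\ X_1 = x_1,\dots,X_{i-1} = x_{i-1}\right] = \mathbb{E}\left[\int_{A^i} g\,d\tilde D^{\otimes i}\right],$$

where $\tilde D$ is a DF process with parameter $\alpha+\sum_{j=1}^{i-1}\delta_{x_j}$, and also

$$\mathbb{E}\left[\int_{A^i} g\,d\tilde D^{\otimes i}\right] = \mathbb{E}[g(X_i,\dots,X_{2i-1})\mid X_1 = x_1,\dots,X_{i-1} = x_{i-1}] = 0,$$

where the first equality comes from the fact that, under the probability $\mathbb{P}[\,\cdot\mid X_1 = x_1,\dots,X_{i-1} = x_{i-1}]$,
the sequence $\{X_{i-1+k} : k\geq 1\}$ is a Pólya sequence with parameter
$\alpha+\sum_{j=1}^{i-1}\delta_{x_j}$ and the second follows from the fact that $g\in\Xi_i(\mathbf{X})$. □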

Remark. A random variable such as $Z = \binom{K}{N}^{-1}\sum_{j(N)\in V_K(N)} h(\mathbf{X}_{j(N)})$, where $h$ is as in
(33), is said to be a U-statistic with (completely) degenerate kernel $h$ of degree $N$, based
on $K$ observations. The kernel $h$ is called "degenerate" since it satisfies the condition
$\mathbb{E}[h(\mathbf{X}_N)\mid\mathbf{X}_{N-1}] = 0$, $\mathbb{P}$-a.s.



4.3. Explicit formulae 



Now, consider the coefficients $\theta^{(k,a)}$, $k\geq 1$, $1\leq a\leq k$, that are defined in formula (19).
The following result provides a way to project functionals of $D$ onto spaces of multiple
integrals.

Theorem 2. Using the notation and assumptions of the present section, suppose that
$F\in L^2(D)$ admits the decomposition

$$F = \mathbb{E}(F) + \sum_{n\geq 1}\int_{A^n} h_{(F,n)}\,dD^{\otimes n},$$

where $h_{(F,n)}\in\Xi_n(\mathbf{X})$. For every $n\geq 1$, $h_{(F,n)}$ must then satisfy equation (10) outside a
set of measure zero with respect to the probability induced on $A^n$ by the vector $\mathbf{X}_n$.



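Proof. For simplicity, we consider $F$ such that $\mathbb{E}(F) = 0$. Moreover, by density, it is
sufficient to demonstrate the result when

$$F = \int_{A^N}\phi_h^{(N)}\,dD^{\otimes N},$$

$h\in\mathcal{H}_N$ and $\phi_h^{(N)}\in\Xi_N(\mathbf{X})$ is given by formula (18). In this case, $h_{(F,n)} = 0$ for $n\neq N$
and $h_{(F,N)} = \phi_h^{(N)}$. For $n<N$, it is true that

$$0 = \sum_{k=1}^{n}\theta^{(n,k)}\sum_{1\leq j_1<\dots<j_k\leq n}\mathbb{E}(F\mid X_{j_1},\dots,X_{j_k}),\qquad\text{a.s.-}\mathbb{P},$$

since we know from the discussion contained in the proof of Corollary 2 that

$$\mathbb{E}\left(\int_{A^N} g\,dD^{\otimes N}\ \Big|\ X_1,\dots,X_k\right) = 0,\qquad\text{a.s.-}\mathbb{P},$$

whenever $g\in\Xi_N(\mathbf{X})$ and $N>k$. When $n = N$, we shall use the relation

$$\int_{A^N}\phi_h^{(N)}\,dD^{\otimes N} = L^2\text{-}\lim_{K\to+\infty}\binom{K}{N}^{-1}\sum_{j(N)\in V_K(N)}\phi_h^{(N)}(\mathbf{X}_{j(N)}) = L^2\text{-}\lim_{K\to+\infty}F_K(\mathbf{X}_K)$$

that was implicitly proven in Corollary 2. Now, $F_K(\mathbf{X}_K)\in SH_N(\mathbf{X}_K)$, and we may apply
formula (18) to obtain, for every $j(N)\in V_K(N)$,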

^4^)(Xj<„,)==5:c") y: E{FK{XK)\Xi^j 

a=l j(a)Cj(JV) 

and therefore 



N 



a=l 



e^(Xj<„,)=EC''M A^j E E(i^K(X^) I Xj,„ 

j(a)Cj(N) 






so that the conclusion is obtained in this case by letting $K$ tend to infinity and using
Lemma 1, as well as the relation

$$\mathbb{E}(F\mid\mathbf{X}_{j(a)}) = L^2\text{-}\lim_{K\to+\infty}\mathbb{E}(F_K(\mathbf{X}_K)\mid\mathbf{X}_{j(a)}).$$

We are left with the case $n>N$. Here, we shall again use the fact that, for every $K$,
$F_K(\mathbf{X}_K)\in SH_N(\mathbf{X}_K)$ so that

n 

O = Y,0in.a) ^ E(F|Xj,„,) 
ti^l j(a)Cy„(a) 

= ^'-.l"^oo(n)^'^"'^' ^ E(F,(X,)|Xj,J, 

^ a=l j(a)Cj(„) 

is immediately proved since, due to the results contained in [18], Lemma 3 and Section 
= 7r[FK,SH,,]{XK)= 

j(„)6V'jc(ri) 



E 



g(n,a) 



E 



E(FK(XK)|Xj,„,) 



□ 



5. Beta random variables and Jacobi polynomials 
(conclusion) 

As a further illustration, we apply the results of the previous sections to the special case 
of the simple DF process on {0, 1} discussed in Section 1.1. 

Suppose that an urn contains $\alpha_1$ red balls and $\alpha_0$ black balls, where $\alpha_1,\alpha_0>0$ (we
could also choose $\alpha_1$ and $\alpha_0$ to be non-integer, with a straightforward interpretation).
As before, an infinite sequence of extractions is performed according to the following
procedure: at each step, a ball is extracted and two balls of the same color are placed
in the urn before the next extraction. We define $\mathbf{X} = \{X_n : n\geq 1\}$ to be the sequence of
random variables defined as

$$X_n = \begin{cases}1, & \text{if the } n\text{th ball extracted is red},\\ 0, & \text{otherwise}.\end{cases}$$

It is clear that $\mathbf{X}$ is, in this case, a Pólya sequence with values in $\{0,1\}^{\infty}$ and parameter
$\alpha_1\delta_1(\cdot)+\alpha_0\delta_0(\cdot)$, where $\delta_x$ stands for the Dirac mass at $x$. Moreover, according to the
Blackwell-MacQueen Theorem, the associated DF process $D$ has the form (2), where $\eta(\omega)$
is a Beta random variable with parameters $\alpha_1$ and $\alpha_0$. In what follows, to be consistent
with the notation used in the Introduction, we will identify the DF process $D$ with the
random variable $\eta$ and will therefore write $\mathcal{M}_n(\eta)$ instead of $\mathcal{M}_n(D)$, $\mathcal{P}_n(\eta)$ instead of
$\mathcal{P}_n(D)$ and so on.
We also recall that, for $n\geq 1$, the space $\Xi_n(\mathbf{X})$ is defined as the class of symmetric
functions $\phi$ on $\{0,1\}^n$ satisfying the relation $\mathbb{E}(\phi(\mathbf{X}_n)\mid\mathbf{X}_{n-1}) = 0$, $\mathbb{P}$-a.s., and, moreover,
by easily adapting (26), the sequence $\mathcal{M}_n(\eta)$ is, in this case, such that $\mathcal{M}_0(\eta) = \mathbb{R}$ and

$$\mathcal{M}_n(\eta) = \left\{Y\in L^2(\eta) : Y = \sum_{m=0}^{n}\binom{n}{m}\phi(1^m0^{n-m})\,\eta^m(1-\eta)^{n-m},\ \phi\in\Xi_n(\mathbf{X})\right\},\qquad n\geq 1,$$

where

$$(1^m0^{n-m}) = (\underbrace{1,\dots,1}_{m\ \text{times}},\underbrace{0,\dots,0}_{n-m\ \text{times}}).$$

The following result, already announced in the Introduction, is a consequence of Proposition 5.

Proposition 7. For every $n\geq 0$, $Y\in\mathcal{M}_n(\eta)$ if and only if $Y$ is a multiple of $J_n^{\alpha_1,\alpha_0}(\eta)$,
where the modified Jacobi polynomial $J_n^{\alpha_1,\alpha_0}$ is defined in (3).

We now want to represent $J_n^{\alpha_1,\alpha_0}(\eta)$ in the form $\int_{\{0,1\}^n}\phi\,dD^{\otimes n}$, where $D$ is defined in
(2) and $\phi\in\Xi_n(\mathbf{X})$. By Proposition 4, we know that it is sufficient to write the equation,
where we let $c_{n,a} = c_{n,a}(\alpha_1+\alpha_0-1,\alpha_1)$, as

$$\sum_{m=0}^{n}\binom{n}{m}\phi(1^m0^{n-m})\,x^m(1-x)^{n-m} = \sum_{a=0}^{n}c_{n,a}x^a,\qquad x\in[0,1],$$

which is satisfied if and only if $\phi$ solves the (triangular) system

$$\sum_{m=0}^{a}\binom{n}{m}\binom{n-m}{a-m}(-1)^{a-m}\phi(1^m0^{n-m}) = c_{n,a},\qquad a = 0,\dots,n. \qquad (35)$$

Theorem 2 and Proposition 7 yield the following result, related to formula (5) in 
Section 1.2. 

Proposition 8. Using the notation and the assumptions of this section, the following
conditions are equivalent:

1. $\psi\in\Xi_n(\mathbf{X})$;

2. $\psi = k\phi$, where $k$ is a real constant and $\phi$ solves the system in formula (35);

3. there exists $Y\in L^2(\eta)$ such that

$$\psi(\mathbf{X}_n) = \sum_{a=1}^{n}\theta^{(n,a)}\sum_{j(a)\in V_n(a)}\mathbb{E}(Y\mid\mathbf{X}_{j(a)}),\qquad \mathbb{P}\text{-a.s.}, \qquad (36)$$

where the coefficients $\theta^{(n,a)}$ are given by formula (19).
In other words, Proposition 8 states that, in the case of a classic Pólya urn and for
every $n$, the only completely degenerate U-statistics of order $n$ are those constructed
by means of kernels that are multiples of solutions of the system in (35). Moreover,
conditional expectations of functionals of $\eta$, with respect to the underlying urn sequence
$\mathbf{X}$, are linked to Jacobi polynomials via formula (36).

We conclude the section by stating the following consequence of Proposition 3.

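Proposition 9. For every $n\geq 1$,

$$\int_0^1 J_n^{\alpha_1,\alpha_0}(x)^2\,p_{\alpha_1,\alpha_0}(x)\,dx = \prod_{l=1}^{n}\frac{n-l+1}{\alpha_1+\alpha_0+n+l-1}\ \sum_{m=0}^{n}\binom{n}{m}\phi(1^m0^{n-m})^2\int_0^1 x^m(1-x)^{n-m}\,p_{\alpha_1,\alpha_0}(x)\,dx,$$

where $\phi$ is given by (35).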
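The statements of this section can be checked in exact arithmetic for small $n$. The sketch below builds a degenerate kernel directly from the condition $\mathbb{E}(\phi(\mathbf{X}_n)\mid\mathbf{X}_{n-1})=0$, which for the urn reads $(\alpha_1+m)\,\phi(1^{m+1}0^{n-m-1})+(\alpha_0+n-1-m)\,\phi(1^m0^{n-m})=0$, expands $\sum_m\binom{n}{m}\phi(1^m0^{n-m})x^m(1-x)^{n-m}$ in powers of $x$, and then verifies that the resulting polynomial is orthogonal to all polynomials of lower degree under the Beta$(\alpha_1,\alpha_0)$ law and that its squared norm equals $c(n,\alpha_1+\alpha_0)\,\mathbb{E}[\phi(\mathbf{X}_n)^2]$. The normalisation $\phi(0^n)=1$, the integer parameters and all function names are assumptions made for illustration; the kernel is therefore only a multiple of the solution of (35).

```python
from fractions import Fraction
from math import comb


def rising(x, k):
    """Rising factorial x (x+1) ... (x+k-1) as an exact fraction."""
    out = Fraction(1)
    for i in range(k):
        out *= x + i
    return out


def beta_moment(k, a1, a0):
    """E[eta^k] for eta ~ Beta(a1, a0)."""
    return rising(a1, k) / rising(a1 + a0, k)


def degenerate_kernel(n, a1, a0):
    """phi[m] = phi(1^m 0^(n-m)), normalised so that phi[0] = 1."""
    phi = [Fraction(1)]
    for m in range(n):
        phi.append(-phi[m] * Fraction(a0 + n - 1 - m, a1 + m))
    return phi


def monomial_coefficients(phi):
    """Coefficients in powers of x of sum_m C(n,m) phi[m] x^m (1-x)^(n-m)."""
    n = len(phi) - 1
    coeffs = [Fraction(0)] * (n + 1)
    for m, value in enumerate(phi):
        for a in range(m, n + 1):
            coeffs[a] += comb(n, m) * comb(n - m, a - m) * (-1) ** (a - m) * value
    return coeffs


def expectation_of_product(p, q, a1, a0):
    """E[p(eta) q(eta)] for polynomials given by coefficient lists."""
    return sum(c_i * d_j * beta_moment(i + j, a1, a0)
               for i, c_i in enumerate(p) for j, d_j in enumerate(q))


def kernel_second_moment(phi, a1, a0):
    """E[phi(X_n)^2] under the urn law (exchangeable binary sequence)."""
    n = len(phi) - 1
    total = Fraction(0)
    for m, value in enumerate(phi):
        weight = rising(a1, m) * rising(a0, n - m) / rising(a1 + a0, n)
        total += comb(n, m) * value ** 2 * weight
    return total


def c(n, total_mass):
    """Coefficient c(n, alpha(A)) of formula (25)."""
    out = Fraction(1)
    for l in range(1, n + 1):
        out *= Fraction(n - l + 1, total_mass + n + l - 1)
    return out


a1, a0, n = 2, 3, 3   # hypothetical urn composition and degree
phi = degenerate_kernel(n, a1, a0)
poly = monomial_coefficients(phi)
print(poly)
print([expectation_of_product(poly, [0] * k + [1], a1, a0) for k in range(n)])
```

For $\alpha_1=\alpha_0=1$ and $n=2$ the same construction returns $1-6x+6x^2$, the shifted Legendre polynomial of degree 2, which agrees with the uniform law being Beta$(1,1)$.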



6. Connections with other orthogonal polynomials 
and multiallele diffusion models 

In this section, we explain how our results can be related to Wright-Fisher diffusion
processes (or, more generally, to Fleming-Viot processes) of population genetics. The
reader is referred to [6], Section 10, [7] and the references therein for basic terminology
and results.

We fix, here and for the rest of the section, an integer $K\geq 2$. The $K$-type Wright-Fisher
process (see also [10]) is defined as the diffusion taking values in the simplex

$$\Delta_K = \left\{\zeta_{(K)} = (\zeta_1,\dots,\zeta_K) : \zeta_i\geq 0,\ \sum_{i=1}^{K}\zeta_i = 1\right\}$$

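and with generator

$$\mathcal{L} = \frac{1}{2}\sum_{i,j=1}^{K}\zeta_i(\delta_{ij}-\zeta_j)\frac{\partial^2}{\partial\zeta_i\,\partial\zeta_j} + \sum_{j=1}^{K}\left(\sum_{i=1}^{K}q_{ij}\zeta_i\right)\frac{\partial}{\partial\zeta_j}, \qquad (37)$$

where $\delta_{ij} = 1$ if $j = i$ and $= 0$ otherwise, and $(q_{ij})$ is the matrix describing the mutation
structure.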

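In what follows, we will write

$$\Delta^*_{K-1} = \left\{\gamma_{(K-1)} = (\gamma_1,\dots,\gamma_{K-1}) : \gamma_i\geq 0,\ \sum_{i=1}^{K-1}\gamma_i\leq 1\right\}$$

and, for any vector $\theta = (\theta_1,\dots,\theta_K)$ such that $\theta_j>0$ ($j = 1,\dots,K$), we denote by $D_\theta$ a
DF process on $\{1,\dots,K\}$ with parameter $\alpha_\theta(\cdot) = \sum_{j=1}^{K}\theta_j\delta_j(\cdot)$, where $\delta_j$ is the Dirac
mass at $j$. Note that, by definition, the vector

$$D_{\theta,K-1} = (D_\theta(\{1\}),\dots,D_\theta(\{K-1\})) \qquad (38)$$

is a random element such that $\mathbb{P}(D_{\theta,K-1}\in\Delta^*_{K-1}) = 1$. We also recall that the law of
$D_{\theta,K-1}$ is absolutely continuous with respect to the restriction of the Lebesgue measure
to $\Delta^*_{K-1}$ and that the associated density is

$$f_{\theta,K-1}(\gamma_{(K-1)}) = \frac{\Gamma(\theta_1+\dots+\theta_K)}{\Gamma(\theta_1)\cdots\Gamma(\theta_K)}\,\gamma_1^{\theta_1-1}\cdots\gamma_{K-1}^{\theta_{K-1}-1}\left(1-\sum_{j=1}^{K-1}\gamma_j\right)^{\theta_K-1}. \qquad (39)$$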




We write $W_{(K-1)}$ to indicate the set of vectors $\mathbf{n} = (n_1,\dots,n_{K-1})$ such that each
$n_j$ is a non-negative integer and, for each $\mathbf{n}\in W_{(K-1)}$, we use the customary notation
$|\mathbf{n}| := \sum_{j=1}^{K-1}n_j$. A system of biorthogonal polynomials
with respect to the density $f_{\theta,K-1}$ introduced in (39) is a double collection of polynomials
$\{\eta^{(1)}_{\mathbf{n}},\eta^{(2)}_{\mathbf{n}}\}$ defined on $\Delta^*_{K-1}$, indexed by the elements of $W_{(K-1)}$ and such that (a) the degree of
$\eta^{(i)}_{\mathbf{n}}$ is given by $|\mathbf{n}|$ ($i = 1,2$) and (b) for every $\mathbf{n}_1,\mathbf{n}_2\in W_{(K-1)}$,

$$\int_{\Delta^*_{K-1}}\eta^{(1)}_{\mathbf{n}_1}(\gamma_{(K-1)})\,\eta^{(2)}_{\mathbf{n}_2}(\gamma_{(K-1)})\,f_{\theta,K-1}(\gamma_{(K-1)})\,d\gamma_{(K-1)} = \begin{cases}0, & \text{if } \mathbf{n}_1\neq\mathbf{n}_2,\\ 1, & \text{if } \mathbf{n}_1 = \mathbf{n}_2\end{cases}$$

(see, e.g., [21] for further details). Note that, for every $\theta = (\theta_1,\dots,\theta_K)$ such that $\theta_j>0$,
a complete system of biorthogonal polynomials with respect to $f_{\theta,K-1}$ can be obtained
by properly renormalizing the double family of generalized Appell-Jacobi polynomials
defined in [12], formulae (2.6) and (2.7).

A crucial point in the analysis of the Wright-Fisher process defined by (37) is the explicit
computation of the associated transition density. This task can be hugely simplified
by introducing the additional assumption

$$q_{ij} = 2^{-1}\theta_j > 0\qquad\forall\,1\leq i\neq j\leq K. \qquad (40)$$

As a matter of fact, in this case, the Wright-Fisher process has a unique stationary
distribution given by the law of the DF process $D_\theta$ on $\{1,\dots,K\}$, where $\theta = (\theta_1,\dots,\theta_K)$,
introduced above. In particular, according, for example, to [12], Section 3 and Section 4,
and [10], Theorem 1, assumption (40) implies that the transition density of the Wright-Fisher
process can be expressed in terms of a class of kernel orthogonal polynomials
$\{Q_n(\cdot\,;\cdot) : n\geq 0\}$ which is uniquely determined by the following conditions: (i) $Q_0 = 1$;
(ii) for every polynomial $R_n$ of degree $n\geq 1$ and defined on $\Delta^*_{K-1}$,

$$R_n(\gamma_{(K-1)}) = \sum_{m=0}^{n}\mathbb{E}[Q_m(D_{\theta,K-1};\gamma_{(K-1)})\,R_n(D_{\theta,K-1})]\qquad\forall\,\gamma_{(K-1)}\in\Delta^*_{K-1},$$

where $D_{\theta,K-1}$ is defined as in (38); (iii) for any complete set of biorthogonal polynomials
$\{\eta^{(1)}_{\mathbf{n}},\eta^{(2)}_{\mathbf{n}}\}$ with respect to $f_{\theta,K-1}$,

$$Q_n(\gamma_{(K-1)};\gamma'_{(K-1)}) = \sum_{\mathbf{n}\in W_{(K-1)}:|\mathbf{n}|=n}\eta^{(1)}_{\mathbf{n}}(\gamma_{(K-1)})\,\eta^{(2)}_{\mathbf{n}}(\gamma'_{(K-1)}) \qquad (41)$$

for every $n\geq 1$ and every $\gamma_{(K-1)},\gamma'_{(K-1)}\in\Delta^*_{K-1}$.

As already pointed out, a complete biorthogonal system of polynomials such as the
one appearing in condition (iii) is explicitly computed (up to normalization constants)
in [12] by means of a generalization of Appell-Jacobi polynomials. Another approach
for computing the kernel polynomials $Q_n$ is used in [10], where the author uses the
representation, valid for $n\geq 1$ and $\gamma_{(K-1)},\gamma'_{(K-1)}\in\Delta^*_{K-1}$,

$$Q_n(\gamma_{(K-1)};\gamma'_{(K-1)}) = \sum_{\mathbf{n}:|\mathbf{n}|=n}P_{\mathbf{n}}(\gamma_{(K-1)})\,P_{\mathbf{n}}(\gamma'_{(K-1)}), \qquad (42)$$

where the family of orthogonal polynomials $\{P_{\mathbf{n}} : \mathbf{n}\in W_{(K-1)}\}$ is obtained through a
Gram-Schmidt orthogonalization, with respect to $f_{\theta,K-1}$, of the monomials

$$M_{\mathbf{n}}(\gamma_{(K-1)}) = \gamma_1^{n_1}\gamma_2^{n_2}\cdots\gamma_{K-1}^{n_{K-1}},\qquad\mathbf{n} = (n_1,\dots,n_{K-1})\in W_{(K-1)},$$

realized by means of a total ordering of the elements of $W_{(K-1)}$ (see [10], pages 311-315
for details). In particular, the degree of each $P_{\mathbf{n}}$ is equal to $|\mathbf{n}|$ and, for $n\geq 1$, the set
$\{P_{\mathbf{n}} : |\mathbf{n}|\leq n\}$ is an orthonormal basis (with respect to the measure induced on $\Delta^*_{K-1}$ by
the density $f_{\theta,K-1}$) of the space of polynomials of degree less than or equal to $n$.

Another consequence of our results is a further probabilistic characterization of the
orthogonal families of polynomials $\{\eta^{(1)}_{\mathbf{n}},\eta^{(2)}_{\mathbf{n}}\}$ and $\{P_{\mathbf{n}}\}$ appearing, respectively, in (41)
and (42). In particular, we have the following proposition, whose proof is a standard
consequence of Proposition 5 and the definition of the family $\{P_{\mathbf{n}}\}$.

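Proposition 10. Using the assumptions of this section and the same notation as in
Proposition 5, for every $n\geq 1$ and every $F\in L^2(D_\theta)$, the following assertions are equivalent:

1. $F\in\mathcal{J}_n(D_\theta) = \mathcal{M}_n(D_\theta)$;

2. there exist real constants $\{c_{\mathbf{n}} : \mathbf{n}\in W_{(K-1)},\ |\mathbf{n}| = n\}$ such that

$$F = \sum_{\mathbf{n}\in W_{(K-1)}:|\mathbf{n}|=n}c_{\mathbf{n}}P_{\mathbf{n}}(D_{\theta,K-1});$$

in particular, the set $\{P_{\mathbf{n}}(D_{\theta,K-1}) : \mathbf{n}\in W_{(K-1)},\ |\mathbf{n}| = n\}$ is an orthonormal basis
of $\mathcal{M}_n(D_\theta)$.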






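Now, denote by $\mathbf{X}^\theta_n$ the first $n$ instants of the Pólya sequence associated with $D_\theta$ (see
Section 2). It follows from Proposition 10 that there exists an orthonormal basis

$$\{h_{\mathbf{n},\theta} : \mathbf{n}\in W_{(K-1)},\ |\mathbf{n}| = n\}$$

of the space $\sqrt{c(n,|\theta|)}\,\Xi_n(\mathbf{X}^\theta)$, where $|\theta| := \sum_{j=1}^{K}\theta_j$, such that, for each $\mathbf{n}$,

$$P_{\mathbf{n}}(D_{\theta,K-1}) = \int_{\{1,\dots,K\}^n}h_{\mathbf{n},\theta}\,dD_\theta^{\otimes n},\qquad\text{a.s.-}\mathbb{P}.$$

For a fixed $\gamma_{(K-1)} = (\gamma_1,\dots,\gamma_{K-1})\in\Delta^*_{K-1}$, we define the measure $\mu_{\gamma_{(K-1)}}$ on
$\{1,\dots,K\}$ as follows:

$$\mu_{\gamma_{(K-1)}}(\{i\}) = \gamma_i,\quad i = 1,\dots,K-1,\qquad\mu_{\gamma_{(K-1)}}(\{K\}) = 1-\sum_{i=1}^{K-1}\gamma_i.$$

For $n\geq 2$, we write $\mu^{\otimes n}_{\gamma_{(K-1)}}$ to indicate the canonical product measure induced by $\mu_{\gamma_{(K-1)}}$
on $\{1,\dots,K\}^n$. Also, $\mu^{\otimes 1}_{\gamma_{(K-1)}} := \mu_{\gamma_{(K-1)}}$. Since, on the event $\{D_{\theta,K-1} = \gamma_{(K-1)}\}$,

$$dD_\theta^{\otimes n} = d\mu^{\otimes n}_{\gamma_{(K-1)}},$$

the following characterization of the kernels $Q_n$ defined above is now easily proved.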

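Proposition 11. Let $\{\eta^{(1)}_{\mathbf{n}},\eta^{(2)}_{\mathbf{n}}\}$ be a complete system of biorthogonal polynomials as
defined above and, for $n\geq 0$, let $Q_n(\cdot\,;\cdot)$ be the kernel orthogonal polynomial defined by
means of conditions (i)-(iii) above. Then, for every $(\gamma_{(K-1)},\gamma'_{(K-1)})\in(\Delta^*_{K-1})^2$ outside
a set of zero Lebesgue measure,

$$Q_n(\gamma_{(K-1)},\gamma'_{(K-1)}) = \sum_{\mathbf{n}\in W_{(K-1)}:|\mathbf{n}|=n}\eta^{(1)}_{\mathbf{n}}(\gamma_{(K-1)})\,\eta^{(2)}_{\mathbf{n}}(\gamma'_{(K-1)}) = \sum_{\mathbf{n}\in W_{(K-1)}:|\mathbf{n}|=n}\int h_{\mathbf{n},\theta}\,d\mu^{\otimes n}_{\gamma_{(K-1)}}\times\int h_{\mathbf{n},\theta}\,d\mu^{\otimes n}_{\gamma'_{(K-1)}}.$$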

One can now use, for example, [10], formula (3.1), page 316, to derive an expression
for the transition density of a Wright-Fisher process with generator (37) in terms of the
kernels $h_{\mathbf{n},\theta}$. Namely, we have the following.



Corollary 3. The transition density at time $t>0$ of the first $K-1$ allele frequencies
of the $K$-type Wright-Fisher diffusion process with generator (37) satisfying (40),
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given a vector of initial frequencies = {1it ■ ■ tI'k) ^ such that " 

(7l,---,7K-i)eASf„i, IS 

Ph(K-iyt^i(K)) 

r(6'i H [-Ok) 0^-1 ex-i-iJi , ^.\r^ i ' ^l 

= r(gi)...r(gK) Y + 2^Pn{t)Qn{l(K^l)n(K-l))\ 



n=l 



^ r(6'i H h 6>jf) ejf-i-i 



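for a.e. $\gamma_{(K-1)}\in\Delta^*_{K-1}$, where $p_n(t) := \exp\{-\tfrac{1}{2}n(n-1)t-\tfrac{1}{2}(\theta_1+\dots+\theta_K)nt\}$. In particular,
if $D_\theta$ is a DF process with parameter $\alpha_\theta$, then for a.e. $\gamma_{(K-1)}\in\Delta^*_{K-1}$, the random
variable

$$G^{t}_{\gamma_{(K-1)}} = p(\gamma_{(K-1)};t,(D_\theta(\{1\}),\dots,D_\theta(\{K\})))$$

is an element of $L^2(D_\theta)$ and, for every $n\geq 1$, the projection of $G^{t}_{\gamma_{(K-1)}}$ onto $\mathcal{M}_n(D_\theta)$
is given by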
is given by 



r{0i H ^Ok) ei-i Bk-i-i 

X Pnit) 



\nG VV : |n| — / 



(43) 



We can finally combine (10) and (43) to obtain that, for every $(a_1,\dots,a_n)\in\{1,\dots,K\}^n$
and every $n\geq 1$,

r(6'i H \-0k) 0,-1 Bk-i-I n f \ u /■ \ 

Y(0-) fWT^^ •••7k-i 2^ P„(7(K-i)) X Ve(«i,---,a„) 

n 

= pr.it)-' J2 ^^"'"^ E ^(G^iK-., nG-ru<-.^ ) I = a,, , . . . , = a, J, 

k=l l<jl<---<]k<n 

where $(X^\theta_1,\dots,X^\theta_n,\dots)$ is the Pólya sequence associated with $D_\theta$.

To conclude, we recall that, in [7], the authors derive an explicit expression of the
transition density of the Fleming-Viot process (i.e., a measure-valued generalization of
the Wright-Fisher diffusion) under conditions ensuring that its stationary distribution is
the law of a general DF process on a compact metric space. The relation between such
a result and the orthogonal decomposition of $L^2(D)$ introduced in our paper is far from
straightforward and will be investigated elsewhere.
7. Further examples and applications

(a) Exponential functionals. Consider a DF process $D$ with parameter $\alpha$. We want to
write the decomposition of the functional

$$G = \exp(\lambda D(C)),$$

where $\lambda$ is a real constant and $C$ is an element of $\mathcal{A}$ such that $\alpha(A)>\alpha(C)>0$. This
implies, according to [8], Proposition 1, that $D(C)\in(0,1)$ with probability 1. Moreover,
we know that, under $\mathbb{P}$, $D(C)$ has a Beta distribution with parameters $(\alpha(C),\alpha(A\setminus C))$.
Now, the decomposition of $G$ is given by

$$G = \mathbb{E}(G)+\sum_{n\geq 1}\int_{A^n}h_{(G,n)}\,dD^{\otimes n} = {}_1F_1(\alpha(C),\alpha(A),\lambda)+\sum_{n\geq 1}\int_{A^n}h_{(G,n)}\,dD^{\otimes n},$$

where ${}_1F_1$ indicates a confluent hypergeometric function of the first kind (see, e.g., [1])
and

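$$\begin{aligned}
h_{(G,n)}(a_1,\dots,a_n) &= \sum_{k=1}^{n}\theta^{(n,k)}\sum_{1\leq j_1<\dots<j_k\leq n}\mathbb{E}(G-\mathbb{E}(G)\mid X_1 = a_{j_1},\dots,X_k = a_{j_k})\\
&= \sum_{k=1}^{n}\theta^{(n,k)}\sum_{1\leq j_1<\dots<j_k\leq n}\left[{}_1F_1\left(\alpha(C)+\sum_{l=1}^{k}\mathbf{1}_C(a_{j_l}),\ \alpha(A)+k,\ \lambda\right)-{}_1F_1(\alpha(C),\alpha(A),\lambda)\right],
\end{aligned}$$

where the second equality derives from the fact that, conditioned on $(X_{j_1},\dots,X_{j_k})$, $D$ is
a DF process with parameter $\alpha+\sum_{l=1}^{k}\delta_{X_{j_l}}$. Similar calculations apply to functionals
of the type

$$G = \exp\left(\sum_{i=1,\dots,n}\lambda_iD(C_i)\right),$$

where $(C_1,\dots,C_n)$ is a finite measurable partition of $A$.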
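The identity $\mathbb{E}[\exp(\lambda D(C))] = {}_1F_1(\alpha(C),\alpha(A),\lambda)$ used above can be checked numerically, since $D(C)$ is Beta distributed with parameters $(\alpha(C),\alpha(A\setminus C))$. The sketch below evaluates the hypergeometric series directly and compares it with a midpoint-rule integral against the Beta density. The function names, the truncation level, the grid size and the values $\alpha(C)=2$, $\alpha(A)=5$, $\lambda=1.5$ are illustrative assumptions.

```python
import math


def hyp1f1(a, b, lam, terms=80):
    """Truncated series sum_k (a)_k / (b)_k * lam^k / k! for 1F1(a, b, lam)."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (a + k) / (b + k) * lam / (k + 1)
    return total


def expected_exponential(mass_c, total_mass, lam, steps=20000):
    """Midpoint-rule value of E[exp(lam * eta)], eta ~ Beta(mass_c, total_mass - mass_c)."""
    a, b = mass_c, total_mass - mass_c
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    width = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * width
        log_density = log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x)
        total += math.exp(lam * x + log_density) * width
    return total


# Hypothetical values: alpha(C) = 2, alpha(A) = 5, lambda = 1.5.
print(hyp1f1(2, 5, 1.5), expected_exponential(2, 5, 1.5))
```

The conditional expectations entering $h_{(G,n)}$ are obtained from the same function by shifting its arguments: after observing $k$ variables of which $r$ fall in $C$, the relevant value is `hyp1f1(mass_c + r, total_mass + k, lam)`.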

(b) Approximation by U-statistics. Now, let $D$ be a DF process with parameter $\alpha$
and let $\mathbf{X}$ be the associated Pólya sequence with parameter $\alpha$. As announced in the
Introduction, we shall use the results of the previous sections to solve the problem of
finding the best approximation of an element $F$ of $L^2(D)$ by means of a symmetric
statistic of the finite sequence $\mathbf{X}_N = (X_1,\dots,X_N)$, where $N$ is a fixed integer strictly
greater than 1. Now, recall that the space of symmetric elements of $L^2(\mathbf{X}_N)$ is the direct
sum of $\mathbb{R}$ and the first $N$ symmetric Hoeffding spaces associated with $\mathbf{X}_N$. According
to Proposition 2, this yields that the above-stated problem reduces to the following: find
functions $h_i\in\Xi_i(\mathbf{X})$, $i = 1,\dots,N$, such that

$$\mathbb{E}\left[\left(F-\left(\mathbb{E}(F)+\sum_{i=1}^{N}\sum_{j(i)\in V_N(i)}h_i(\mathbf{X}_{j(i)})\right)\right)^2\right] = \min_{g_i\in\Xi_i(\mathbf{X}),\ i=1,\dots,N}\mathbb{E}\left[\left(F-\left(\mathbb{E}(F)+\sum_{i=1}^{N}\sum_{j(i)\in V_N(i)}g_i(\mathbf{X}_{j(i)})\right)\right)^2\right]. \qquad (44)$$



To carry out this program, we introduce the coefficients, appearing in the statement
of Corollary 9 in [18] and defined for $n\geq 1$ and $r = 0,\dots,n$,

$$c(r,n,\alpha(A)) = \prod_{l=1}^{n-r}\frac{n-r-l+1}{\alpha(A)+n+l-1}$$

and note that

$$c(0,n,\alpha(A)) = c(n,\alpha(A)),$$

where the term on the right is defined in (25).
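To illustrate the role of these coefficients, the sketch below works in the two-colour urn of Section 5 and checks by exhaustive enumeration that, for a degenerate kernel $\phi$ of degree $n$ and two index sets of size $n$ sharing exactly $r$ elements, the expectation of the product of the two kernel evaluations equals $c(r,n,\alpha(A))\,\mathbb{E}[\phi(\mathbf{X}_n)^2]$. The upper limit $n-r$ of the product is the reading adopted here (an empty product, equal to 1, when $r=n$). The kernel is normalised by $\phi(0^n)=1$; function names and integer urn parameters are assumptions made for the example.

```python
from fractions import Fraction
from itertools import product


def c(r, n, total_mass):
    """c(r, n, alpha(A)) as a product over l = 1..n-r (empty product = 1)."""
    out = Fraction(1)
    for l in range(1, n - r + 1):
        out *= Fraction(n - r - l + 1, total_mass + n + l - 1)
    return out


def urn_probability(xs, a1, a0):
    """Probability of the binary sequence xs under the urn with a1 red, a0 black."""
    p, ones, zeros = Fraction(1), 0, 0
    for x in xs:
        denominator = a1 + a0 + ones + zeros
        if x == 1:
            p *= Fraction(a1 + ones, denominator)
            ones += 1
        else:
            p *= Fraction(a0 + zeros, denominator)
            zeros += 1
    return p


def degenerate_kernel(n, a1, a0):
    """phi[m]: value of a degenerate symmetric kernel on points with m ones."""
    phi = [Fraction(1)]
    for m in range(n):
        phi.append(-phi[m] * Fraction(a0 + n - 1 - m, a1 + m))
    return phi


def overlap_expectation(r, n, a1, a0):
    """E[phi(X_1..X_n) * phi(X_{n-r+1}..X_{2n-r})]: index sets share r elements."""
    phi = degenerate_kernel(n, a1, a0)
    total = Fraction(0)
    for xs in product((0, 1), repeat=2 * n - r):
        first, second = sum(xs[:n]), sum(xs[n - r:])
        total += urn_probability(xs, a1, a0) * phi[first] * phi[second]
    return total


a1, a0, n = 1, 1, 2   # hypothetical urn and kernel degree
second_moment = overlap_expectation(n, n, a1, a0)
for r in range(n + 1):
    print(r, overlap_expectation(r, n, a1, a0), c(r, n, a1 + a0) * second_moment)
```

Each printed line should show two equal fractions; the case $r=0$ recovers the coefficient $c(n,\alpha(A))$ of (25).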

Proposition 12. Suppose that $F\in L^2(D)$ admits the decomposition

$$F = \mathbb{E}(F)+\sum_{n\geq 1}\int_{A^n}h_{(F,n)}\,dD^{\otimes n},$$

where $h_{(F,n)}\in\Xi_n(\mathbf{X})$. Condition (44) is then satisfied by



ay 



for $i = 1,\dots,N$. Moreover, the corresponding quadratic error is given by

-\ 2 



F-(E(F)+f] ^ /^.(Xj,.,)) 
= ^ c(n,a(^))E[;i(p,„)(X„)2] 

n>Ar+l 



N 



•^E[V,«)(X„)' 



n=l 



c(n, a{A)) — 



E 

r=0 



n\ ( N — n 



r \ n ^ r 



z{r,n,a{A)) 
Proof. It is sufficient to prove that, for every $i\geq 1$ and every $N\geq i$, for every $h_i, g_i\in
\Xi_i(\mathbf{X})$,



E 



By a density argument we can take hi = 0^'^ , given by formula (18) for a certain / G . 
We may then write, using the notation r(i(i),j(i)) := Card(i(i) A Corollary 10 in [18], 
as well as a simple combinatorial argument, 



E 



U h^dD^^ Y 54Xj,,)) =^lim^-^E( '^'(^J(o) E 5.(Xj,J 



1™ 7^ E E E IE(/i,:(Xj(.,).g.(Xi<„)) 



— (f ) 



j(,)eVff(i) '•=Oi(i)ey„(i): 



E(rj ( i-r j c(r,z,a(A))E(/i,(Xj,,)g,(Xj„)) 
-i-Ej Y '^^(^Jc)) E 5.(Xj,,)). 



j(i)eVN(i 

The last formula in the statement is straightforward. □ 

For example, from the calculations in part (a), we obtain that the best approximation
of $G = \exp(\lambda D(C))$, by means of U-statistics based on $\mathbf{X}_N$, is

N 

G^=E(G) + ^ Y h^iXiJ 

'=1 j(i)CVjv(i) 

= ii^i(a(G),a(^),A) 

+E E irX^'-'' 

«=ij(i)cyiv(Ofe=i 



X 



E 



li^i a{C) + Y ^c{Xj,),a{A) +k,\ 



j(fc)Cj(i) L \ 1=1 
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-ii^i(a(C),a(A),A) . 

Remark. Suppose that $(A,\mathcal{A}) = ([0,1],\mathcal{B}([0,1]))$, where $\mathcal{B}$ stands for the Borel $\sigma$-field
and $\alpha$ is equal to the Lebesgue measure. In this case, it is well known that the corresponding
DF process $D$ can be represented as the random probability generated on $[0,1]$
by an increasing process of the type $\{G_t/G_1 : t\in[0,1]\}$, where $G$ is a Gamma process
on $[0,1]$, that is, a Lévy process on $[0,1]$ with Lévy measure (see, e.g., [15]) given by
$\nu(dx) = \mathbf{1}_{(x>0)}\exp(-x)\,dx/x$ (observe that the normalized process $G/G_1$ is independent
of $G_1$). It follows that every $F\in L^2(D)$ is also a member of $L^2(G)$, that is, the space of
square-integrable functionals of $G$. It would therefore be interesting to find some explicit
relation between our orthogonal decomposition of $L^2(D)$ and the chaotic decompositions
of $L^2(G)$ established in, for example, [22] or [15].
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